RavenPack | May 22, 2018
View an extract of this session held at the London Big Data and Machine Learning Revolution event in April 2018. You can also access the full video and slides.
Asger covers macroeconomic forecasting of Chinese time series using a large number of predictor variables. He investigates how much forecasts improve when news sentiment indexes built using RavenPack data are included among the predictors. The results suggest that forecasts obtained with this method outperform univariate autoregressions and that, at shorter prediction horizons, the news indexes improve the forecasts.
How can we do better?
Think about the big data we have in the world that tells you something about economic activity. If you wanted to predict microeconomic activity, you would need all the data to be infinitely precise. What we do instead is aggregate everything into a few macroeconomic indicators, such as interest rates, balance of payments, production and so forth.
The approach that we are going to take is anchored in econometrics. We work with machine learning techniques, but I wanted to anchor it to the literature it came from. In that literature, you take all the macroeconomic indicators that you can get access to, from sources such as the Federal Reserve, the IMF and the OECD. In our case we have around 130. You then extract the components from them and plug them into a time series model. Of course, you can always use your favourite machine learning technique to do the forecasting.
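The "extract the components, then use a time series model" recipe can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: it assumes the components are principal components of the standardised indicator panel and that the time series model is a simple least-squares regression of the target on its own last value and the components. The function names, the synthetic data and the choice of three components are all hypothetical.

```python
import numpy as np

def extract_components(X, n_components):
    """Return the first n_components principal components of a (T x N) panel."""
    # Standardise each indicator so that no single series dominates.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardised panel; U * S gives the component scores.
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def forecast_next(y, factors, horizon=1):
    """Regress y[t + horizon] on a constant, y[t] and factors[t], then forecast."""
    T = len(y)
    regressors = np.column_stack([np.ones(T), y, factors])
    # Align predictors at time t with the target at time t + horizon.
    beta, *_ = np.linalg.lstsq(regressors[:-horizon], y[horizon:], rcond=None)
    # Apply the fitted coefficients to the most recent observation.
    return float(regressors[-1] @ beta)

# Synthetic stand-in for a panel of 130 indicators over 80 periods (assumption).
rng = np.random.default_rng(0)
T, N = 80, 130
common = rng.standard_normal(T)  # hypothetical common driver of the indicators
X = np.outer(common, rng.standard_normal(N)) + 0.5 * rng.standard_normal((T, N))
# Hypothetical target that follows the common driver with a one-period lag.
y = np.concatenate([[0.0], 0.8 * common[:-1]]) + 0.1 * rng.standard_normal(T)

factors = extract_components(X, n_components=3)
prediction = forecast_next(y, factors, horizon=1)
print(factors.shape, prediction)
```

In a genuine out-of-sample exercise the components and the regression would be re-estimated at each forecast date using only data available up to that date; the sketch fits everything once for brevity.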
This data is extremely low frequency, and it comes from very well-established organisations that have a rigid way of producing their macroeconomic data.
We take all the macroeconomic data and try to extract some information from it. We then augment that with the sentiment data.
The datasets consist of: balance of payments, stock prices, domestic product, foreign relations, government, natural disaster, product services, foreign exchange, consumption, acquisition and mergers, assets, earnings and many more.
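The talk does not describe how the per-category sentiment indexes are constructed. As a purely illustrative sketch, assume each news event arrives as a (date, category, score) record and that an index is the average score per month and category. The record layout, the monthly frequency and the simple average are all assumptions, and none of this reflects RavenPack's actual schema.

```python
from collections import defaultdict
from statistics import mean

def build_sentiment_indexes(events):
    """Average event-level sentiment scores per (month, category)."""
    buckets = defaultdict(list)
    for date, category, score in events:
        period = date[:7]  # "YYYY-MM" prefix of an ISO date string
        buckets[(period, category)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}

# Hypothetical event records: (ISO date, category, sentiment score).
events = [
    ("2018-01-03", "stock prices", 0.4),
    ("2018-01-17", "stock prices", -0.2),
    ("2018-01-20", "foreign exchange", 0.1),
    ("2018-02-05", "stock prices", 0.6),
]

indexes = build_sentiment_indexes(events)
for (period, category), value in sorted(indexes.items()):
    print(period, category, round(value, 3))
```

Each category then yields one low-frequency series that can sit alongside the macroeconomic indicators as an additional predictor.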
After estimating the model, you can see the performance on the left side. It's organised so that positive numbers go to the right: those models outperform the benchmark, which is the one without any macro or news data, and the negative ones do not. The results are split into three parts. The first chunk of results uses only the macro data. The second combines the macro variables with the lagged value of GDP. The third uses only the sentiment.
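One way to get a score where positive means "outperforms the benchmark" is a relative error reduction. The sketch below assumes root mean squared error as the accuracy measure, which the talk does not specify; the function names and the numbers are hypothetical.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared forecast error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def relative_improvement(actual, model_forecasts, benchmark_forecasts):
    """1 - RMSE_model / RMSE_benchmark: positive when the model beats the benchmark."""
    return 1.0 - rmse(actual, model_forecasts) / rmse(actual, benchmark_forecasts)

# Hypothetical out-of-sample values and forecasts.
actual = [1.0, 2.0, 3.0]
benchmark = [2.0, 3.0, 4.0]     # e.g. a univariate autoregression, off by 1 each period
better_model = [1.5, 2.5, 3.5]  # off by 0.5 each period
worse_model = [3.0, 4.0, 5.0]   # off by 2 each period

print(relative_improvement(actual, better_model, benchmark))  # 0.5
print(relative_improvement(actual, worse_model, benchmark))   # -1.0
```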
What you see here is that the value of the news is not as big as before. Maybe this is due to the news being old.
If you combine the sentiment with the target, you get more mixed results, but in general the news is more informative than the macro variables.
When we do research into macroeconomic forecasting, we should take new data sources into account. We should not rely so much on the company data that we get; there are many more sources of information.