Bull bear balance: A cluster analysis of socially informed financial volatility

Bibliographic Details
Published in: 2017 Computing Conference, pp. 421-428
Main Authors: Manfield, Jonathan, Lukacsko, Derek, Souza, Tharsis T. P.
Format: Conference Proceeding
Language: English
Published: IEEE 01-07-2017
Description
Summary: The use of alternative data in financial applications has gained momentum in recent years with the increased availability of data and computational resources. While traditional financial pricing theory supports an efficient market hypothesis, recent research has shown that data mining of exogenous feeds can provide further information to inform market activity. Social media has become an increasingly important source of this information due to its abundant, directed, and real-time nature. However, little is known about what combination of social media and financial features is indicative of market activity. In this work, we investigate what combination of social media and financial features is present when social media data is effective for reducing uncertainty about future stock volatility. Moreover, identification of feature profiles from clusters of stocks indicates that sentiment polarity (i.e., positive or negative) taken alone is not enough to infer future volatility; instead, a balance of bullish and bearish signals is preferred, even above features commonly identified in the literature such as message volume and market cap. This is important because, by combining bullish and bearish sentiment with a range of other social and financial variables, we are able to generate a time series that is more informative about volatility than any of the individual feature time series. The robustness of these findings is verified across 500 stocks from both the NYSE and NASDAQ exchanges. Reported results are reproducible via a freely available open-source library for social-financial analysis.
DOI: 10.1109/SAI.2017.8252134