Roadmap

Much like the architecture page, this roadmap lays out where we are and where we are going. Each of the phases below breaks down where we are in each of the categories. For more information on a category, go to the phase or the architecture page.

Categories:

  • Data gathering
  • Data Cleaning
  • Data set creation
  • Feature Extraction
  • Predictive analysis
  • Recommender System
  • Display

     

    Current Phase:

    • Data gathering

      • Basic scraping from Yahoo

    • Data Cleaning

      • Basic

    • Data set creation

      • Basic level-one data set that combines end-of-day (EOD) data into a time series and normalizes it to a % change statistic

    • Feature Extraction

      • None

    • Predictive analysis

      • None

    • Recommender System

      • Basic - FCNN4R is used to generate a model for balancing the portfolio's allocation levels, i.e. what % of the total to put into each available asset

      • Basic - A simple performance analyzer function, i.e. how much money did we lose today?

    • Display

      • None
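
To make the current-phase data set and performance analyzer concrete, here is a minimal sketch of the two calculations: a day-over-day % change series built from EOD closes, and a "how much did we gain or lose today" figure for a given set of allocation weights. The function names, the input layout (a dict of closing prices per symbol) and the choice of simple rather than log returns are assumptions made for illustration, not a description of the existing implementation.

```python
from typing import Dict, List


def pct_change(closes: List[float]) -> List[float]:
    """Day-over-day % change for one asset's EOD closing prices.

    The result is one element shorter than the input, because the
    first day has no previous close to compare against.
    """
    return [(curr - prev) / prev * 100.0 for prev, curr in zip(closes, closes[1:])]


def build_dataset(eod: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Apply pct_change to every symbol (hypothetical input layout)."""
    return {symbol: pct_change(closes) for symbol, closes in eod.items()}


def daily_pnl(portfolio_value: float,
              weights: Dict[str, float],
              day_pct: Dict[str, float]) -> float:
    """Money gained (positive) or lost (negative) in one day.

    weights: fraction of the total placed in each asset (assumed to sum to 1).
    day_pct: that day's % change for each asset.
    """
    return sum(portfolio_value * weights[s] * day_pct[s] / 100.0 for s in weights)
```

For example, `build_dataset({"AAA": [100.0, 110.0, 99.0]})` gives one % change series per symbol, and the last element of each series can be fed to `daily_pnl` together with the allocation levels produced by the model.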

    Next Phase feature priority:

    • Data gathering

      • Integrating FED data into datasets

      • Integrating second-tier assets, e.g. possibly options

      • Dead stock data integration, i.e. failed or renamed companies

        • Without this data, it will be virtually impossible to identify sinkholes before landing in them

    • Data Cleaning

      • Ongoing: as new data sets are integrated, overlap will need to be identified and removed

    • Data set creation

      • Continuing the expansion of the base-level daily EOD time series data sets

    • Feature Extraction

      • Nothing specific targeted

    • Predictive analysis

      • Integration of the forecast algorithms provided by Hyndman

        • forecast R package

    • Recommender System

      • Integrate global search algorithms such as genetic algorithms (GA) into the portfolio mix, i.e. which assets to include in the portfolio

      • Refining the performance assessment algorithm.

        • There are many improvements to be made. We really need a trading expert to identify the levers we can pull.
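
The overlap removal listed under Data Cleaning above could look roughly like the sketch below. It assumes every record carries a `symbol` and an ISO-formatted `date` field and that the first source listed wins when two sources cover the same day. Both the record layout and the "first source wins" rule are assumptions for illustration; the real rule may need to compare values or prefer a particular provider.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]


def merge_without_overlap(*sources: Iterable[Record]) -> List[Record]:
    """Merge EOD records from several sources, dropping overlapping rows.

    Two rows overlap when they share the same (symbol, date) key.
    Sources are processed in the order given, so earlier sources take
    priority over later ones.
    """
    seen: Dict[Tuple[str, str], Record] = {}
    for source in sources:
        for rec in source:
            key = (str(rec["symbol"]), str(rec["date"]))
            if key not in seen:
                seen[key] = rec
    # Return rows ordered by symbol, then date (ISO dates sort correctly as text).
    return [seen[key] for key in sorted(seen)]
```

Passing the existing Yahoo data first keeps the current values stable while a new source only fills in the days and symbols that were missing.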

    Target State:

    • Data gathering

      • All available think tank data and reputable data sources integrated

      • Google-level data processing power integrated via Hadoop et al.?

    • Data Cleaning

      • Redundant and unnecessary data filtered out

      • MapReduce, filtering, normalization, etc.

    • Data set creation

      • The full list will be determined as available data sets are identified.

    • Feature Extraction

      • Also TBD: features have to be identified as data sets are analyzed and experts in those fields/sets point out useful data points worth chasing.

      • Clustering analytics integrated

      • KNN within an industry and across industries

        • Spend profiles, forecast profiles, debt profiles: any features of a company that can be used to generalize performance or changes in performance to other companies

    • Predictive analysis

      • Open to suggestions, but most of the forecasting algorithms are pretty straightforward for straight forecasting (ANOVA etc.). The potential scope is huge, though: a 5-day forecast for X assets gets big quickly.

    • Recommender System

      • Parallel NN integration

      • Feedback loops / time series weighted delays identified

      • Global search algorithms integrated as necessary

    • Display

      • Open to suggestions
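
As a rough illustration of the KNN item under Feature Extraction in the target state, the sketch below ranks companies by how close their feature profiles are to a target company, either across all industries or restricted to the target's own industry. The profile layout (a tuple of already-scaled numbers per company), the use of Euclidean distance and all names are assumptions for illustration only; which features actually matter is still TBD.

```python
import math
from typing import Dict, List, Optional, Sequence


def nearest_companies(target: str,
                      profiles: Dict[str, Sequence[float]],
                      k: int = 3,
                      industries: Optional[Dict[str, str]] = None) -> List[str]:
    """Return the k companies whose profiles are closest to the target's.

    profiles: company -> feature vector (e.g. spend, forecast, debt),
              assumed to be scaled to comparable ranges beforehand.
    industries: optional company -> industry mapping. When given, only
              companies in the target's industry are considered;
              when omitted, the search is cross-industry.
    """
    candidates = [
        name for name in profiles
        if name != target
        and (industries is None or industries[name] == industries[target])
    ]
    # Sort by Euclidean distance, breaking ties by name for a stable order.
    ranked = sorted(
        candidates,
        key=lambda name: (math.dist(profiles[target], profiles[name]), name),
    )
    return ranked[:k]
```

Calling it once without `industries` and once with gives the cross-industry and within-industry neighbour lists for the same company, which can then be compared.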