How Venture Capitalists Use Artificial Intelligence To Better Source Deals And Assess Startups

This article is an overview of the latest developments in AI for venture capital and the emerging ecosystem of solution providers (as of Q2 2018).

Introduction

Inspired by Francesco Corea’s recent Medium post, “Artificial Intelligence and Venture Capital”, we did not want to miss the opportunity to add our two cents and discuss the status quo of artificial intelligence in the VC industry. Only a few years ago, a handful of VCs started to experiment with automation and machine learning in their internal operations. Today, investment firms publish data science job offers and openly talk about the different ways they use machine learning. But because the VC industry deals with far less high-quality data than other types of investors, data-driven approaches are hard to implement. Even if the industry as a whole is still lagging behind, recent announcements from big funds pursuing AI-related initiatives are sparking the interest of more traditional startup investors. In addition to internal projects at VC firms, providers of AI solutions for VCs are also starting to emerge, offering investors a quicker, more robust path to adopting machine learning.

But how do VCs benefit from artificial intelligence? Let’s look at it from three different angles: by player, by area of application, and by data source.


1. Perspective by player


Please note that the number of firms currently employing these applications may be significantly higher, as this article is based only on publicly available information. This is especially relevant for the largest players, which are more secretive when it comes to unveiling their deal-sourcing and assessment approaches. For instance, GV and Sequoia have sporadically disclosed that they employ data scientists, but have never explained what those efforts translate into.

Even considering these constraints, the total volume invested by the reported funds amounted to $9.0Bn by 2017 and to $5.3Bn by 2018 YTD. Their number of investments added up to around 1,200 by 2017, with an average check size of $7.4M that shrinks to $2M after excluding the 5 largest players.


2. Perspective by area of application


Evaluation

Researchers and VCs have long been aware of the many biases influencing investment decisions. One of the most prominent is the perpetuation bias, by which the applicants who most closely resemble the traits of the idealized entrepreneur are more likely to get funded. The extensive work of Laura Huang provides several examples (e.g. “Investors Prefer Entrepreneurial Ventures Pitched by Attractive Men”). Can data-driven approaches mitigate the difficulty “humans” have in creating meaning out of large sets of unstructured information?

Due to the relative scarcity of start-up data and its heterogeneity, most current solutions focus on predicting successful events. For instance, Hone Capital defines success as the ability of a start-up to raise a Series A round and attempts to predict the likelihood of that outcome. Following this methodology, they claim to be able to identify companies raising Series A with an accuracy of 40%. This represents 2.5x the industry average, which rises to 3.5x when the results of the model are also filtered by the investment team.
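The multiples quoted above imply a baseline that the text does not state explicitly. The following is a small worked calculation, assuming both multiples are measured against the same industry average; the variable names are ours and the inputs are Hone Capital's claims as reported, not figures we have verified.

```python
# Figures as quoted in the text (claims, not verified here).
model_hit_rate = 0.40    # share of model picks that go on to raise a Series A
lift_model_only = 2.5    # "2.5x the industry average"
lift_with_team = 3.5     # multiple once the investment team also filters the picks

# Implied industry average: model hit rate divided by its multiple.
implied_baseline = model_hit_rate / lift_model_only

# Implied hit rate for model + team, ASSUMING the same baseline applies.
implied_with_team = implied_baseline * lift_with_team

print(f"implied industry average:  {implied_baseline:.0%}")
print(f"implied model + team rate: {implied_with_team:.0%}")
```

Under that assumption, the claim amounts to roughly one in six companies raising a Series A on average, versus more than half of the picks that pass both the model and the team.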

The trade-off is always data quality vs. data quantity. For example, widely available public databases such as Crunchbase, Pitchbook, Owler, or Dealroom are in the quantity game. Information is abundant overall but rarely goes into detail for small companies, where data is scarce, sometimes vague, and often outdated. It makes for great industry-level analysis, but not so much at the company level. Who can blame them? Data collection at this level has to be done manually in most cases. Some players, like CB Insights, have realized they can automate parts of their data collection process (they claim 70%).

There are no shortcuts to superior data quality. By 2012, Correlation Ventures had already partnered with 20 VCs to access their internal statistics and reached out to hundreds of companies manually. They gathered a dataset of 80,000 equity financings since 1987 in which at least one VC firm participated. These sources are now benchmarked against the internal data available for each company applying for funding. All applicants are required to submit basic planning, financial history, and legal documents (e.g. term sheets, cap tables). The data is then used in the firm’s analytic models. Their sustained efforts have resulted in one of the most automated processes in the industry: once a start-up scores high on their criteria, only a single 30-minute in-person interview is needed to make a decision. This reduces the time required for decision making to an average of 2 weeks.

WR Hambrecht, a US-based investor, focuses on a different kind of question: “How can we better predict when innovations will survive or fail?”. They claim that factors related to a startup’s operations have a predictive power of 20% and that only 12% is related to the team. After 8 years of operations, their model has been accurate on 67% of its predictions, and the funds are estimated to achieve returns over 500% based on subsequent offers for their portfolio companies.

Sourcing

Dedicated AI applications aim to automate and expand VCs’ sourcing processes in order to diversify their scope, discover promising startups, and discard dubious ones.

While evaluation solutions are more prominent among VC firms themselves, external data service providers specialized in VC lead the sourcing field. The investment required to set up an infrastructure to crawl, homogenize, and maintain various data sources pays off better when it serves a pool of customers than when a single VC bears it alone.

But the most data-intensive VCs did not wait and built their own software solutions. For instance, InReach Ventures, which also has an AI evaluation model, invested $7M to develop its proprietary software and bears maintenance costs of over $1M per year. As of December 2017, its suite of AI applications had allowed it to evaluate 95,000 European start-ups and then screen the 2% that were a good initial fit.

In theory, the technical process is straightforward. A combination of public and private data sources is first selected, then crawled, consolidated, and finally filtered by investment criteria. The variety of databases in use can be a differentiator in some cases. For instance, the seed fund SignalFire disclosed that they “collect data from patents to academic publications to open source contributions to financial filings”. SignalFire’s GP also stated that the firm invests in private raw consumer transaction data. Apart from using these sources to discover hidden gems, they opened their data platform to 50 third parties in exchange for their serving as on-demand advisors to SignalFire’s portfolio companies.

In general terms, these AI processes entail data crawling modules (i.e. to map, monitor, and extract unstructured sources of data), identification modules (i.e. to homogenize and consolidate company references and understand relations within the start-up network), and clustering modules (i.e. to group and categorize similar players, industries, or news). Once these processes are implemented, VCs experience a noticeable increase in the quantity and diversity of the leads sourced. Fly Ventures claims to discover 1,000 new start-ups per week. Right Side Capital has been able to invest in 850 companies since establishing a new data-driven approach in 2012, allowing the fund to reduce its check size below $300k. Social Capital reports an even lower average allotment of only $70k per investment, and its recent investments are distributed across 24 countries, with 80% of startups led by non-white founders.
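To make the identification step more concrete, here is a minimal sketch of how company references coming from different crawlers could be homogenized and consolidated. It is an illustration only, not the approach of any fund named above: the suffix list, the record fields (`name`, `source`), and the sample companies are all hypothetical, and real systems typically also match on domains, addresses, and fuzzy similarity.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "gmbh", "corp", "co"}

def normalize_name(name: str) -> str:
    """Lowercase the name, drop punctuation, and strip trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def consolidate(records):
    """Group raw records (dicts with 'name' and 'source') by normalized name."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize_name(record["name"])].append(record)
    return dict(groups)

# Made-up records, as two different crawlers might report them.
records = [
    {"name": "Acme Robotics, Inc.", "source": "crawler_a"},
    {"name": "ACME Robotics", "source": "crawler_b"},
    {"name": "Globex GmbH", "source": "crawler_a"},
]
merged = consolidate(records)
for key, group in merged.items():
    print(key, "<-", [r["source"] for r in group])
```

The two "Acme Robotics" references collapse into a single entity, which is the precondition for the clustering and scoring steps that follow.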

Value-added

First, feedback solutions have been developed to provide ad-hoc recommendations by benchmarking start-ups against competitors. Providers of this kind of application are usually also present in one of the two previous categories, as they leverage the data sources already gathered for sourcing and evaluation. Roberto Bonanzinga of InReach Ventures explained the synergy: “By better clarifying which data best translates to successful startups, VCs can educate current and future entrepreneurs”.

Startup Compass, for instance, argues that start-ups should grow proportionally across each of their dimensions (team, product …). They developed a tool to warn and guide start-ups that are scaling prematurely. Hone Capital aims to guide entrepreneurs with recommendations based on concrete success metrics that may have been tested while sourcing and evaluating their leads.

Under training solutions, we have broadly grouped those solutions that not only give feedback to entrepreneurs but go one step further and deliver the means to improve portfolio companies’ performance. Some of the most disruptive solutions can be found in this category.

First, some funds supply data to train and validate the business models of their promising AI investees, which have been suffering from data hunger since their inception. For Gradient Ventures, the most recent early-stage Google fund, the data was already there. Usually, it is not that easy. The team at Georgian Partners has found a way of achieving similar results without that home-field advantage: they use differential privacy techniques that let portfolio companies pool, share, and anonymize proprietary data and benefit from shared insights.
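For readers unfamiliar with differential privacy, the textbook Laplace mechanism gives a feel for the idea: an aggregate computed over pooled data is released with calibrated random noise, so that no single record can be inferred from the answer. The sketch below is generic and is not a description of Georgian Partners' implementation; the function names, the pooled values, and the choice of epsilon are all assumptions made for illustration.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Count matching records, then add noise calibrated to the query.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical pooled metric (e.g. head counts contributed by several companies).
pooled = [12, 48, 7, 95, 33, 61]
rng = random.Random(42)
noisy = private_count(pooled, lambda n: n > 30, epsilon=0.5, rng=rng)
print(f"noisy count of companies with more than 30 employees: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy; a larger one means answers closer to the true count.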

In addition, external providers are also coming up with very creative solutions. Pitchbot.vc offers an automated conversational bot that helps founders become better at pitching by answering its questions. The bot can play the role of an incubator, a seed fund, or a high-profile VC.

Lastly, matching solutions also exist for start-ups, offering them the possibility to find investors that are investing in similar startups at the same stage. For example, Dorm Room Fund offers startups and investors alike a list of prospects based on company description, industry, and location.


3. Perspective by data source


Cross-functional data

Every now and then we spot global data providers expanding their financial data services to VCs. Given that it may be in the interest of venture capitalists, we have considered a wider range of global data providers. Most data solutions cater to different kinds of investors including private equity funds, hedge funds, lenders and large corporates.

  • Digital footprint: Twitter, Facebook, App Store, web traffic, web forums… Probably the most extensive source of information, especially for B2C companies, with the challenge of extracting and transforming it into useful and understandable insights for investors. Some interesting examples here are iSentium and Dataminr. The former scrapes Twitter posts, identifies keywords expressing positive or negative sentiment, and finally ranks each company’s sentiment on a quantitative score. Following a similar philosophy, Dataminr monitors social network activity live to immediately alert on sentiment-changing events.
  • Financial information: AI has enabled techniques to look at traditional financial sources in a more innovative and timely manner. On the processing side, Kensho has developed a platform that crawls publicly available company data to help answer users’ financial queries instantly. On the analytical side, Prattle is discovering insights hidden deep in traditional central bank reports and company earnings calls. They perform sentiment analysis based on the grammatical structures, nuanced wording, and tone in use.
  • Consumer data: A few companies, such as Earnest Research, specialize in acquiring consumer data, consolidating it, and extracting insights on consumer behavior. The CircleUp team, which tries to predict the likelihood of breakout success for over 1.2M US retail companies, claims that this sector is uniquely positioned to benefit from data treatment techniques. Its CEO declared: “The business models of retail companies are very similar. Whether a company is selling dog food, shampoo or water. Second, there’s endless data on consumer product and retail companies”.
  • Satellite images: AI image processing and object recognition techniques are claimed to enable the use of satellite imagery to estimate granular economic and demographic metrics, substituting to some extent for more traditional economic indicators. For instance, in 2016 SpaceKnow launched an index to monitor industrial activity in China. It claims to process 2.2Bn satellite observation points and to individually monitor around 6,000 industrial facilities to do so.
  • Team background and dynamics: There is also a set of ventures that use team data either to support team dynamics or to evaluate which companies are most likely to thrive. AiNgel specializes in scoring companies using data such as educational background, employment history, entrepreneurial experience, and personality traits.
  • Transactions data: Credit card transactions are a highly fragmented and unstructured data type, but also the most granular for understanding consumer behavior, trends, and expenditures. This kind of data has been extensively pooled and analyzed by Second Measure. They claim to access an anonymized selection of 2–3% of all credit card transactions in the US, down to the store level.

Conclusion

A few years back, deploying AI applications constituted a bold and novel bet for VCs, with the promise of better deal flow. The point we are making here, following up on Francesco’s article, is that an increasing number of VC firms are looking at developing their own in-house software to digest and analyze a growing amount of startup data. However, it’s important not to lose focus by dealing with non-core activities, or, as Cassie Kozyrkov puts it, “Are you in the business of making bread? Or making ovens?”

At PreSeries, we have designed a framework upon which you can automate start-up dealsourcing and assessment efforts. We built our platform to take full advantage of both public and proprietary investor data, while keeping all private information secure. Feel free to get in touch, we’d love to show you how it works.

Arturo Moreno, CEO

Fabien Durand, Product & Marketing

Alfonso Palomero, Strategy Intern

Sources

Entrepreneur. February 6th, 2018. Here’s how AI is changing VC funding – https://www.entrepreneur.com/article/309198

McKinsey&Company. June 27th, 2017. A machine learning approach to Venture Capital – https://www.mckinsey.com/industries/high-tech/our-insights/a-machine-learning-approach-to-venture-capital

Forbes. October 2nd, 2015. Reimagining VC Investing: How Correlation Ventures is Attracting and Keeping the Best New Startups – https://www.forbes.com/sites/mnewlands/2015/10/02/reimagining-vc-investing-how-correlation-ventures-is-attracting-and-keeping-the-best-new-startups/#2c4bca393929

XConomy. March 13th, 2018. Too many venture capital cooks in the kitchen – https://www.xconomy.com/san-diego/2018/03/13/too-many-venture-capital-cooks-in-the-kitchen/

Medium. March 16th, 2017. Introducing the U.S. Venture Exit Year Index by Correlation Ventures – https://medium.com/correlation-ventures/u-s-venture-exit-year-index-by-correlation-ventures-1bf98d1077a9

Wall Street Journal. January 13th, 2012. Correlation Ventures raises $165M for data focused investment approach – https://blogs.wsj.com/venturecapital/2012/01/13/correlation-ventures-raises-165m-for-data-focused-investment-approach/

Boston Biotech Watch. January 17th, 2012. Quant VC Correlation Ventures new “Dream Date” – https://bostonbiotechwatch.com/2012/01/17/quant-vc-correlation-ventures-vcs-new-dream-date/

PEHub. January 17th, 2012. Correlation Ventures Closes $165M Fund That Will Use Predictive Analytics – https://www.pehub.com/2012/01/correlation-ventures-closes-165m-fund-that-will-use-predictive-analytics/#

Vator. July 19th, 2012. Are micro VCs boosting along short-term entrepreneurs? – http://vator.tv/news/2012-07-19-are-micro-vcs-boosting-along-short-term-entrepreneurs

Fortune. August 5th, 2015. Could algorithms help create a better venture capitalist? – http://fortune.com/2015/08/05/venture-capital-hits-average/

Fast Company. November 19th, 2013. This prediction algorithm can tell if your start-up will fail – https://www.fastcompany.com/3021903/this-prediction-algorithm-can-tell-if-your-startup-will-fail

Financial Times. December 11th, 2017. Artificial intelligence is guiding venture capital to start-ups – https://www.ft.com/content/dd7fa798-bfcd-11e7-823b-ed31693349d3

Techcrunch. October 22nd, 2015. Watch out, VCs: Chris Farmer says he’s about to massively disrupt the industry – https://techcrunch.com/2015/10/22/watch-out-vcs-chris-farmer-says-hes-about-to-massively-disrupt-the-industry/

Medium. October 30th, 2015. Venture capital disintermediation is coming – https://startupsventurecapital.com/introducing-global-beta-ventures-ad49dd7bebd0

Pitchbook. March 15th, 2018. Data driven investing: Why “gut-feeling” may no longer be enough? – https://pitchbook.com/news/articles/data-driven-investing-why-gut-feel-may-no-longer-be-good-enough

The New Stack. May 14th, 2018. Could data-based, human-free investing eliminate bias? – https://thenewstack.io/could-data-based-human-free-investing-eliminate-bias/

Bloomberg. May 1st, 2018. Impress the Algorithm. Get $250,000 – https://www.bloomberg.com/news/features/2018-05-01/white-male-vcs-tend-to-fund-white-male-entrepreneurs-could-robots-do-better

CNBC. July 10th, 2017. Google will invest in AI startups and send its engineers to help them out for up to a year – https://www.cnbc.com/2017/07/10/google-launches-gradient-ventures-to-invest-in-a-i-start-ups.html

The Globe And Mail. April 6th, 2018. Georgian Partners rewriting the rules for venture capitalists as it closes in on record Canadian fund – https://www.theglobeandmail.com/business/article-georgian-partners-rewriting-the-rules-for-venture-capitalists-as-it/

Techcrunch. January 25th, 2018. Dorm Room Fund has built a CRM for founders raising a seed round – https://techcrunch.com/2018/01/25/dorm-room-fund-has-built-a-crm-for-founders-raising-a-seed-round/

Digital Globe. March 30th, 2018. Spaceknow: Using GBDX to bring transparency to the global economy – http://blog.digitalglobe.com/technologies/spaceknow-using-gbdx-to-bring-transparency-to-the-global-economy/

J.P. Morgan. May, 2017. Big Data and AI strategies: Machine learning and alternative data approach to investing.

Solr cluster for fast searches in PreSeries Datamarts – Under the hood #1

This article is the first in a series of technical deep dives where we focus on the engineering challenges of building a platform like PreSeries. Today, we explain how we drastically improved the speed of search queries performed by PreSeries’ users.


One of the problems we were facing was the low performance of our Mongo replica set when querying big collections (more than 30 million documents) using multiple conditions.

It only got worse when we tried to run searches using keywords or free text. Every time we wanted to search on a new field, we were forced to create multiple indexes, which considerably increased the size of the database. There is also a limit on the number of indexes you can create on a Mongo collection (64), making it almost impossible to cover all the scenarios we wanted to address.

Eventually, the solution we adopted was inspired by how the DataStax™ Enterprise Platform™ works: DataStax uses Solr as a search backend to complement the power of Cassandra databases, allowing the creation of Solr indexes on Cassandra tables.

If you take a look at their architecture, you can see that they use Apache Cassandra as their core database engine in conjunction with a Solr service that handles sophisticated search queries without compromising service performance. That makes sense, because managing queries in Cassandra is much harder than in Mongo. Sure, Cassandra offers secondary indexes to query fields that would normally not be queryable, but at the expense of performance, which is severely affected.

What is Solr?

Apache Solr™ is a highly reliable, scalable, and fault-tolerant distributed indexing solution with the latest search and index technology. Solr is a search engine at heart, but it is much more than that. It is a NoSQL database with transactional support. It is a document database that offers SQL support and executes it in a distributed manner. Solr enables powerful full-text search and near real-time indexing.

At this point, we knew we wanted to bring Solr’s impressive features to the PreSeries Datamarts in order to improve the performance of our queries, but how? Our main concern was how to keep the collections from our Datamarts synchronized with the Solr service automatically, and how to do so as quickly as possible. Eventually, we found the solution in the mongo-connector project.

The following figure shows the Solr architecture, formed by a cluster of nodes. We can see how Solr splits the data into shards, distributes the shards among the nodes based on the number of shards per node we have configured, and maintains replicas of these shards to ensure reliability, using a replication factor that can be set per core (SolrCloud’s index).

Solr-Architecture.png

Figure 1- Architecture of a normal Apache Solr™ deployment
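As a rough illustration of how these three settings fit together, the sketch below builds a SolrCloud Collections API request and derives how many cores and nodes the layout needs. The host, the shard and replica counts, and the helper name are assumptions for the example, not our production values; `maxShardsPerNode` reflects the Solr versions current at the time of writing, so check the reference guide for the version you run.

```python
import math
from urllib.parse import urlencode

def create_collection_url(base_url, name, num_shards, replication_factor,
                          max_shards_per_node):
    """Build a Collections API CREATE request URL (nothing is sent here)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"

# Illustrative values only.
num_shards, replication_factor, max_shards_per_node = 4, 2, 2
url = create_collection_url("http://localhost:8983/solr", "companies_data",
                            num_shards, replication_factor, max_shards_per_node)

# Every shard is stored replication_factor times, one core per copy.
total_cores = num_shards * replication_factor
# Minimum number of nodes able to host all cores under the per-node limit.
min_nodes = math.ceil(total_cores / max_shards_per_node)

print(url)
print(f"{total_cores} cores spread over at least {min_nodes} nodes")
```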

What is mongo-connector and how does it work?

The mongo-connector is a Python utility that creates a pipeline from a MongoDB cluster (replica set) to one or more target systems, and one of the available target systems is, happily, Solr. Mongo-connector is modular: it is based on a core library and independent doc managers that know how to talk to the different target systems. In our case, the manager we needed was the solr-doc-manager library.

Below is the architecture of the solution with all the processes between Mongo and Solr:

PreSeries-Solr-Architecture.png

Figure 2- Architecture of the MongoDB to Solr synchronization

With mongo-connector, we can synchronize one or multiple Mongo collections into a single Solr core. The question that arises is: how can we differentiate between the data of different collections if we put everything in the same core/index? The answer is easy: the library adds a special field named ‘ns’ (namespace) that contains the name of the original Mongo collection. This field becomes a discriminator that we can use to filter results down to the original collection we are interested in.
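In practice, the discriminator is applied as a Solr filter query (`fq`) on the `ns` field. The snippet below sketches how such a request could be assembled; the host, the core name, and the helper function are assumptions for illustration, and the namespace shown is only an example value.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def select_url(base_url, core, query, namespace):
    """Build a /select URL restricted to documents from one Mongo namespace."""
    params = [
        ("q", query),
        ("fq", f'ns:"{namespace}"'),  # keep only docs synced from this collection
        ("wt", "json"),
    ]
    return f"{base_url}/{core}/select?{urlencode(params)}"

url = select_url(
    "http://localhost:8983/solr",          # assumed Solr address
    "companies_data",                      # assumed core name
    "*:*",
    "preseries_db.companies_data_weekly",  # example namespace value
)

# Decode the URL again to show the filter that Solr will receive.
filters = parse_qs(urlsplit(url).query)["fq"]
print(url)
print(filters)
```

Because `fq` clauses are cached independently of the main query, filtering by namespace this way stays cheap even when the same core holds several collections.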

In the following figure we can see how to synchronize three MongoDB collections that are holding the insights generated for the companies we are tracking, into one unique Solr collection. Each of these original collections maintains a view of the companies, applying different periodicities: weekly, monthly, and yearly.

PreSeries-Solr-Collections.png

Figure 3- Synchronized data from MongoDB to Solr

Let’s break down this chart and highlight the main parts:

  • On the left we see the structure of the Mongo collections that will be synchronized with Solr. We can see different types of fields, ranging from strings to decimals to arrays of strings.
  • In the middle, we can see the syntax of the command we should run in order to start the synchronization process. This process is responsible for tailing the Mongo oplog in order to detect all the changes in the Mongo database, and for applying those changes to the Solr collection as soon as they are detected.
    • What is the meaning of the oplog-ts parameter? The process keeps the given file (companies_collections.log) updated with the timestamp of the latest Mongo change indexed into Solr. The sync process uses the value saved in this file as the starting point for future synchronizations.
    • The auto-commit-interval parameter has a big impact on the performance of the entire process, because it directly affects the performance of index updates. This parameter tells Solr how often to flush changes to disk. In our case, our ETLs make changes that affect millions of documents every time they are executed. These actions therefore trigger millions of operations in Solr, which implies a lot of changes in the index.
    • The -n parameter lets us specify the Mongo collections that we want to synchronize into the Solr core/index. In our case, we are talking about three different collections.
    • And finally, we specify the core/index in Solr (companies_data) that will hold all the data coming from Mongo.
  • On the right side of the image we describe the schema of the destination collection in Solr. We can identify 3 new fields in this collection: _TS, NS, and ID.
    • _TS: this field is automatically generated by mongo-connector and holds the time at which the original document was created or updated in Mongo.
    • NS: this field holds the reference to the original collection name in Mongo. In this case, it can contain one of the following values: [“preseries_db.companies_data_weekly”, “preseries_db.companies_data_monthly”, “preseries_db.companies_data_yearly”]
    • ID: this field holds the original Mongo _ID of each document.
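Since the command itself only appears in the figure, here is a hedged reconstruction of what such an invocation might look like, assembled in Python so each argument can be commented. The flags are mongo-connector's documented command-line options; the MongoDB and Solr addresses and the interval value are assumptions, while the file name, the namespaces, and the core name come from the description above. The command is only printed, not executed.

```python
import shlex

# Mongo namespaces (db.collection) to synchronize, as listed in the text.
namespaces = [
    "preseries_db.companies_data_weekly",
    "preseries_db.companies_data_monthly",
    "preseries_db.companies_data_yearly",
]

command = [
    "mongo-connector",
    "-m", "localhost:27017",                            # assumed replica set address
    "-t", "http://localhost:8983/solr/companies_data",  # assumed Solr URL + target core
    "-d", "solr_doc_manager",                           # doc manager for Solr
    "-o", "companies_collections.log",                  # oplog-ts progress file
    "-n", ",".join(namespaces),                         # namespace set to sync
    "--auto-commit-interval", "1200",                   # seconds; illustrative value
]

print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the arguments in a list makes it easy to launch the process from a supervisor script (for example with `subprocess.run(command)`) once the values match your environment.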

The queries

In Solr, we can conduct queries of different types:

  • Advanced queries, using multiple clauses and comparators. Fuzzy searches.
  • Nested queries. Nest an arbitrary query type inside another query type.
  • Faceting, which gives us category counts, among other things.
  • Range faceting, which gives us the ability to divide a numeric field into range categories. It allows us to discretize numerical fields and analyze the range counts, among other things.
  • Boosting, which allows us to score some fields higher than others.
  • TF-IDF and BM25. Solr historically used TF-IDF as its default scoring algorithm. TF-IDF stands for “term frequency-inverse document frequency”: it weighs how frequently a term occurs in your field or document against how frequently that term occurs overall in your collection. This algorithm has some issues, which BM25 (the default since Solr 6) mitigates. BM25 smooths this process, effectively letting documents reach a saturation point after which the impact of additional occurrences is dampened.
  • Grouping and aggregations.
  • Field statistics: count, missing, max, min, average, number of distinct values, list of distinct values, most frequent values.
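The saturation effect mentioned for BM25 is easier to see with numbers. The sketch below implements the term-frequency component of the classic BM25 formula with the commonly used defaults k1 = 1.2 and b = 0.75. It is a simplified illustration rather than Lucene's exact implementation, and it leaves out the inverse document frequency factor.

```python
def bm25_tf(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Term-frequency component of classic BM25 (IDF factor omitted)."""
    # Length normalization: longer-than-average documents are penalized.
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + k1 * norm)

# Score contribution as the term occurs 1..10 times in an average-length doc.
scores = [bm25_tf(tf, doc_len=100, avg_doc_len=100) for tf in range(1, 11)]

# Marginal gain brought by each additional occurrence.
gains = [later - earlier for earlier, later in zip(scores, scores[1:])]

for tf, score in enumerate(scores, start=1):
    print(f"tf={tf:2d}  score={score:.3f}")
```

Each extra occurrence adds less than the previous one, and the contribution never exceeds k1 + 1, whereas a raw term-frequency weight would keep growing.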

These are some examples of the kind of queries we are able to do:

Queries_Stats.png

Figure 4- Field statistics query

Queries_Faceting.png

Figure 5- Faceting query

Queries_RangeFaceting.png

Figure 6- Range Faceting query with complex filtering

Queries_PivotFaceting.png

Figure 7- Pivot Faceting or Decision Trees

Queries_Aggregations.png

Figure 8- Aggregations
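For readers who cannot see the screenshots, the following sketch shows the kind of request parameters behind queries like these. The parameter names (`stats.field`, `facet.field`, `facet.range.*`, `facet.pivot`) are standard Solr options; the field names (`score`, `status`, `country`) and the helper functions are hypothetical and do not reflect the actual PreSeries schema.

```python
from urllib.parse import urlencode

BASE = [("q", "*:*"), ("rows", "0")]  # match everything, return no documents

def field_stats(field):
    """Statistics (min, max, mean, count, ...) for one field."""
    return BASE + [("stats", "true"), ("stats.field", field)]

def facet(field):
    """Counts per distinct value of a field."""
    return BASE + [("facet", "true"), ("facet.field", field)]

def range_facet(field, start, end, gap):
    """Counts per numeric bucket of width `gap` between start and end."""
    return BASE + [
        ("facet", "true"),
        ("facet.range", field),
        ("facet.range.start", str(start)),
        ("facet.range.end", str(end)),
        ("facet.range.gap", str(gap)),
    ]

def pivot_facet(*fields):
    """Nested counts, e.g. status within each country."""
    return BASE + [("facet", "true"), ("facet.pivot", ",".join(fields))]

for params in (field_stats("score"), facet("status"),
               range_facet("score", 0, 100, 10), pivot_facet("country", "status")):
    print(urlencode(params))
```

Setting `rows` to 0 is a common choice for this kind of analytical query, because only the counts and statistics are needed, not the matching documents.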

Issues faced along the road

Not everything has been quick and easy.

The main issues we encountered were related to keeping both backends synchronized once the initial synchronization was done.

After the initial synchronization, which uses bulk operations to speed up the process, synchronization becomes very slow. This happens because mongo-connector processes changes one by one. Every time a change is detected, the connector loads the data from Mongo, transforms the Mongo document into a Solr document using the schema published in Solr, and finally sends the operation to Solr. No bulk operations were done!

Another issue was that “updates” were very slow, taking days to finish. The Solr doc manager followed a two-step process per document: the first step was to load the data from the Solr backend to check whether the document existed, and the second was to send the updated version of the document back to Solr. In this scenario, bulk updates aren’t possible.

The solution we found was to clone the original mongo-connector project from GitHub and make some changes to the implementation.

These are the changes we made in the code:

  • We do bulk inserts and updates when possible. We load and process as many changes as possible and send them to Solr in bulk.
  • The first step in an update is to request the current version of the document from Solr, in order to apply the changes to the loaded doc afterwards. We now request multiple docs at the same time instead of one by one: a bulk read. This reduces by thousands the number of queries we need to send to MongoDB.
  • We use a high auto-commit-interval, around 20 minutes or so, to avoid flushing changes every time a document (or a batch of documents) is sent to Solr. We need to take into account that we make millions of changes in Mongo every time we execute an ETL, so this has a big impact.
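The core idea behind the first two changes is batching. The sketch below is a generic illustration of that pattern, not the actual patch we applied to mongo-connector: `send_bulk` stands in for whatever function pushes a list of documents to Solr, and the batch size is an arbitrary example value.

```python
from itertools import islice

def batched(iterable, size):
    """Yield successive lists of up to `size` items from an iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

def sync_changes(changes, send_bulk, batch_size=1000):
    """Forward changes in batches; returns how many bulk calls were made."""
    calls = 0
    for batch in batched(changes, batch_size):
        send_bulk(batch)  # one round trip for the whole batch
        calls += 1
    return calls

# Stand-in for the real sender: just record the size of each batch.
sent_sizes = []
calls = sync_changes(range(2500), lambda batch: sent_sizes.append(len(batch)))
print(calls, sent_sizes)
```

With one-by-one processing, the 2,500 changes in this example would cost 2,500 round trips; batched, they cost three.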

Final results

The results are stunning! As we can see in the next image, we went from response times measured in seconds in MongoDB to response times measured in milliseconds in Solr, both accessing collections with more than 50 million documents.

Our API is now also able to resolve almost all queries within a second. We have considerably reduced our response times, which allows us to serve many more requests in considerably less time, using fewer resources.

We have also been able to considerably reduce the size of our MongoDB replica set by removing most of the indexes that we needed to resolve queries with multiple criteria.

Results of the performance tests:

Screen Shot 2018-05-09 at 12.07.29 PM.png

Some explanation about the queries:

  • By ID: a simple query where we look up the data of a specific company by its ID in a specific snapshot (we generate weekly snapshots of the company details).
  • By Keywords: we look for the companies that contained the terms “Machine Learning” or “Software” in any of the company textual fields.
  • By Complex Criteria: we look for the companies that matched the following criteria:
    • Founded after 1st January 2010.
    • In “running” or “acquired” status
    • With a score greater than 40%
    • Organizations that are only companies (we have also schools and groups indexed in the database)
  • By Keywords + Complex Criteria: the two previous queries together.
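To give an idea of how the combined query translates into Solr syntax, here is a sketch that builds the request parameters for the criteria listed above. All field names (`text`, `founded_on`, `status`, `score`, `type`) are hypothetical, and we assume the score is stored on a 0 to 100 scale; the real PreSeries schema may differ.

```python
from urllib.parse import urlencode

def build_search_params(keywords, founded_after, statuses, min_score, org_type):
    """Translate the benchmark criteria into Solr q/fq parameters."""
    if keywords:
        terms = " OR ".join(f'"{k}"' for k in keywords)
        params = [("q", f"text:({terms})")]
    else:
        params = [("q", "*:*")]
    # "{" makes the lower bound exclusive: strictly after the date / above the score.
    params.append(("fq", "founded_on:{" + founded_after + " TO *]"))
    params.append(("fq", "status:(" + " OR ".join(statuses) + ")"))
    params.append(("fq", "score:{" + str(min_score) + " TO *]"))
    params.append(("fq", "type:" + org_type))
    return params

params = build_search_params(
    keywords=["Machine Learning", "Software"],
    founded_after="2010-01-01T00:00:00Z",
    statuses=["running", "acquired"],
    min_score=40,
    org_type="company",
)
for key, value in params:
    print(key, "=", value)
print(urlencode(params))
```

Putting each criterion in its own `fq` lets Solr cache and reuse the filters independently, which suits dashboards where users toggle criteria one at a time.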

Want to build your very own startup deal sourcing & assessment platform with PreSeries? Get in touch here!

PreSeries Predicts! Our A.I.’s Ranking Of The Top 10 Startups in FinTech

Originally published on Medium

PreSeries predictive algorithms crawl the web hungry for startup information. So far, almost 400k companies have been ruthlessly processed, scored, and ranked. Today, we offer you a sneak peek at the PreSeries Dashboard and our latest ranking of the Top 10 startups in FinTech from around the globe.

Top 10 Startups in FinTech

PreSeries’ Company Ranking — FinTech (March 9, 2018)

Do you agree/disagree with the ranking? Let us know your thoughts on Twitter with the hashtag #PreSeriesPredicts

1 — Nubank

Nubank is the leading fintech in Latin America. Using bleeding-edge technology, design and data, Nubank is committed to fighting complexity and empowering Brazilians to take control of their finances. Over 8 million people have applied for its mobile-controlled credit card since its launch in September 2014. Located in the Pinheiros region of São Paulo, Nubank has raised USD 180 million in investment rounds led by Sequoia Capital, Founders Fund, Tiger Global, Kaszek Ventures, Goldman Sachs, QED Investors and DST Global.

PreSeries’ overall scoring of Nubank over time

2 — StreetShares

StreetShares offers unique financial solutions for America’s heroes and their communities. StreetShares’ technology captures the social loyalty that exists within the military community and harnesses that trust to lower risk in financial transactions. StreetShares provides a suite of specialty finance products focused on the military and veterans market, including small business funding, lines of credit, and alternatives to VA small business loans for vet-owned businesses. StreetShares is also a factoring company offering invoice factoring and account receivables financing for the government contract (GovCon) community, as well as the StreetShares Patriot Express® program. StreetShares offers alternative investments, including a veterans social-impact investing product called Veteran Business Bonds. StreetShares is veteran-run and located outside of Washington, D.C.

PreSeries’ overall scoring of StreetShares over time

3 — Bond Street

Bond Street is a startup focused on transforming small business lending through technology, data and design. Small business owners are the foundation for growth in our economy, and yet today’s banking system has left them behind. They’re building a better future where access to financing is simple, transparent and fair. They’re backed by a renowned group of technology and financial services investors and are building a world-class team in New York City.

PreSeries’ overall scoring of Bond Street over time

4 — Shufti Pro

ShuftiPro is a digital identity verification application that offers an intelligent mechanism for the verification of ID cards, passports, driving licenses, and credit/debit cards. Their mission is to enable businesses to eliminate customer identity fraud and increase their profitability. ShuftiPro is an online verification application designed to minimize online identity fraud while providing businesses with a viable solution to trim down the risks involved while maintaining KYC. It has specifically helped online merchants make online transactions secure. ShuftiPro currently supports 150+ languages in more than 150 countries.

PreSeries’ overall scoring of Shufti Pro over time

5 — Bancor

Bancor Protocol™ is a standard for the creation of Smart Tokens™, cryptocurrencies with built-in convertibility directly through their smart contracts. Bancor utilizes an innovative token “Connector” method to enable formulaic price calculation and continuous liquidity for all compliant tokens, without needing to match two parties in an exchange. Smart Tokens™ interconnect to form token liquidity networks, allowing user-generated cryptocurrencies to thrive. For more information, please visit the website and read the Bancor Protocol™ Whitepaper.

PreSeries’ overall scoring of Bancor over time

6 — Cadre

Cadre provides superior access and insight to the universe of alternative investments. Founded in 2014 by Ryan Williams, Cadre is a marketplace where investors benefit from greater transparency, actionable information, lower fees, and more flexibility. The company’s innovative technology drives efficiency and powers insight for its participants. Cadre has raised approximately $135M in funding and transacted on several hundred million dollars worth of investments to date.

PreSeries’ overall scoring of Cadre over time

7 — Proplend

Proplend’s FCA-approved peer-to-peer lending platform connects investors directly to creditworthy borrowers, enabling investors to earn attractive returns and borrowers to gain access to funding that may not otherwise be available. Investors choose which loans and borrowers to lend to, investing in the loan-to-value-based risk tranche(s) they’re comfortable with. All loans are secured by income-producing UK commercial property.

PreSeries’ overall scoring of Proplend over time

8 — Spotcap

Spotcap empowers small business owners with tailored finance, allowing them to focus on what really matters: their business. The company assesses the real-time performance of businesses to grant short-term credit lines. Headquartered in Berlin, Germany, Spotcap launched in Spain in September 2014 before expanding to the Netherlands and Australia in 2015, the UK in 2016 and New Zealand in 2017. The company is led by Founder and CEO Jens Woloszczak. The growing team currently consists of more than 120 employees globally. Spotcap is backed by a number of world-class investors including Rocket Internet, Finstar Financial Group, Access Industries, Holtzbrinck Ventures and Heartland Bank.

PreSeries’ overall scoring of Spotcap over time

9 — Octane Lending

Octane Lending is a point-of-sale finance and insurance marketplace that helps salespeople help their customers obtain financing. They are currently focused on the recreational market (motorcycles, ATVs, UTVs, personal watercraft, boats, RVs and snowmobiles). Their web-based platform helps dealers save time by eliminating the need to rekey customer information and helps move more units by opening dealerships to more prime/subprime lending sources. They leverage their large merchant network to act as an efficient complement to lenders’ existing loan origination systems.

PreSeries’ overall scoring of Octane Lending over time

10 — Kasisto

Kasisto leverages decades of research and development in artificial intelligence. KAI Banking enables financial institutions to add virtual assistants and smart bots to their mobile apps and leading messaging platforms. With an emphasis on great user experience, KAI-powered virtual assistants and smart bots are easy to implement, customize and maintain.

PreSeries’ overall scoring of Kasisto over time

 


Love PreSeries AI-driven rankings? Stay tuned, follow us at @PreSeries & #PreSeriesPredicts

Want to give PreSeries a go? Get in touch here!

Startup raises £100,000 from an AI-powered VC in seconds, live on-stage!

Originally published on Medium
7th AI Startup Battle at PAPIs ’17 (Boston, Oct. 2017)

Boston, October 2017… I thought I heard a mosquito fly. The room was packed, yet everyone remained silent. All were focusing their attention on a little device on-stage. A small black vertical cylinder was stealing the show before it even started. “Is it … an Amazon Alexa?” said someone, and curiosity was quick to fill the whole room. Everyone had a theory about why a voice assistant was being prepped like an athlete ready to enter the field. Excited, yet a bit more worried, were the people in the front row: the startup founders. They knew what was about to happen. An AI was about to judge their startups live on-stage and, in the end, choose a winner. The tension was palpable: “you can read a crowd of ‘human’ investors, but how do you approach an AI?”

Suddenly, a deep voice arose from the stage: “Alexa, ask PreSeries to start the Battle”. A mere second after the host’s command, the little black cylinder was awake and hungry for startups to judge. “I am ready to score the first startup,” it said calmly.

Startup founders then took turns, 1–2 minutes each on stage, to answer questions from the PreSeries AI. The questions revolved around team composition, founders’ background, proprietary technology or industry characteristics.

PreSeries in action!

“Ok, that’s everything I needed to know,” said PreSeries. It took only 8 minutes to collect and evaluate the information needed to determine which of the 5 contenders was most likely to succeed. “Ask PreSeries who is the winner?” commanded the host. “The winner is GreenSight Agronomics,” replied the AI. One of the attendees, a local VC, said “the funny thing is that the winner was my favourite!”

The next AI Startup Battle will be at PAPIs Europe (April 5 — London)

Impressed by an AI thinking like a VC? Now think about the last time you missed out on a deal because your process was too slow… Maybe you took a lot of time to learn about that company… Or maybe your decision-making process was too long because you needed to gather more data … The problem here is our own limitations as investors. Spotting the signals that are good predictors of success for startups is not easy. Because we’re dealing with such uncertainty, the amount of data needed to derive decent startup insights and uncover actionable trends is considerable. Fortunately for us, machine learning (ML) provides the answer.

We are witnessing the adoption of ML across the entire venture capital industry at an ever-accelerating pace. Some VC firms started using machine learning a while ago, but most of them are just taking their first steps. From data collection and processing to predictive insights … the preachers of disruption are the ones being slowly disrupted (Google, Hone Capital, Fly.vc, InReach Ventures, Sequoia, Kleiner Perkins, etc.).

If you invest in startups for a living and feel that you are not doing any of this …

We created PreSeries with the vision of a faster, more efficient and transparent process to allocate resources in the startup community. We do the heavy lifting when it comes to data collection and predictive modeling.

The more relevant data you feed machine learning models, the better the quality of the analysis. By using machine learning in lieu of manual guess-estimates (read “spreadsheets”) to evaluate startups, you not only address the breadth of information you can handle, but you’re also able to automate and drastically reduce the cost of the whole process.

We built PreSeries for that purpose. We want startup investors to concentrate time and money where it matters, not on technical tasks like collecting, processing, or evaluating data. Startup scouting and assessment should be like breadcrumbs in your budget. Let us show you how!

We argue that by liberating time from scouting and screening, you can spend time on helping your portfolio companies, which is what we believe venture capital will be all about in the future. (Check out Hunter Walk’s VC time distribution to get a better idea of current time distribution.)

Let our AI invest £100,000 in your startup!

We recently announced that London-based AI Seed venture capital fund will be investing £100,000 in the winner of the next AI Startup Battle that will take place on April 5, 2018 in London at PAPIs.io Europe. And as you already know: No humans in the jury. More info and prizes details here!

>> APPLY NOW <<

The AI Startup Battle arrives in London — Apply now! (Pss: our jury fits in a box!)

Originally published on Medium

Human vs. AI has always been a major theme in science fiction. Our love for storytelling and fear of unchained technology still feed the idea that one day we will lose control over the machines we built. In reality, machine learning is a bit more boring and mankind is still far from being doomed. We all had dreams of human-like androids and sentient computers, but instead we have Roomba® vacuum cleaners and trust issues when using Google Translate.

The Singularity is nigh!

Instead of preaching the singularity, we therefore prefer to concentrate on a friendlier sub-domain of AI: Machine Learning for Startup Investing. Overall, the financial industry is one of the largest beneficiaries of AI. Machine Learning coupled with vast amounts of information helped change the mantra from “more data is better” to “actionable insights are better”. Nonetheless, the venture capital asset class is still playing hard to get. In order to tackle this issue firsthand, we decided to build a Machine-Learning-as-a-Service (MLaaS) platform that collects information from a vast set of sources and then generates automated real-time insights & scores about startups and their industries. Startup investors can now easily make data-driven investment decisions. More on our website.

A “VC-in-a-Box” & one-of-a-kind startup battle!

Better insights make better investors, but great investors are able to make decisions in a very short amount of time. So why not put our AI to the test where startup investors usually don’t have much time to assess potential investments: startup competitions! For this very reason, 3 years ago, we launched a series of international startup competitions where our AI is tasked with replacing a whole “human” jury. #HumansNeedNotApply. We call our events the AI Startup Battles and they’ve already been held in Spain, Brazil and the United States (check our blog). In our competitions, contenders interact live with our AI through a little voice-enabled device present on stage (hence the name “VC-in-a-Box”).

PreSeries’ voice interface is powered by an Amazon Echo

Our AI asks the participants a set of questions looking for fact-based insights as well as many other investment and traction-related metrics. It imitates human investors in the types of questions it asks, but the breadth of information it can access in real time is unmatched. PreSeries’ predictive models are trained with a diverse set of public and private data on more than 370,000 companies worldwide and can derive actionable insights in a matter of seconds.

We believe that our AI can do a great job at judging the potential of an early-stage startup by looking at features around team composition, its industry, its technological edge and its funding history. There is nothing more entertaining than having a data-driven algorithm decide your fate! Don’t you agree?

Note: If you are organizing startup competitions and are ready to bring it to the next level, get in touch with us and we’ll help you organize your very own AI Startup Battle using PreSeries. Let’s talk at battle(at)preseries(dot)com!

AI Startup Battle @ PAPIs Europe

If your heart is pure and you are not afraid to face our digital mastermind, I warmly invite you to apply to participate in the 8th edition of our AI Startup Battle in London (April 5th 2018). The competition is hosted by PAPIs Europe, an international conference on the latest innovations to create real-world Machine Learning applications. It features amazing demos and talks by renowned international experts in AI.

Up to the challenge? Then, fill in the application form available HERE before March 5, 2018! If your startup uses AI as a core enabler, this battle is your chance to get under the spotlight. The startups selected to compete will be able to pitch on stage, make connections, exhibit at PAPIs Europe, and get unique exposure among a highly distinguished audience.

Our co-organizers:

  • PAPIs — PAPIs is the 1st series of international conferences dedicated to real-world Machine Learning applications, and the innovations, techniques and tools that power them. It combines the best of industry and academic conferences, with its expert committee of reviewers and its publications in PMLR. PAPIs is also committed to increasing diversity in ML by making it easier for certain groups to attend conferences.
  • BigML — BigML is the leading Machine Learning company that pioneered the creation of Machine Learning as a Service (MLaaS). BigML offers a consumable, programmable and scalable Machine Learning platform that makes it easy to solve and automate Machine Learning tasks. BigML’s platform helps tens of thousands of analysts, software developers, and scientists from organizations of all sizes and industries to transform data into actionable models.
  • AI Seed — AI Seed offers investment and support for the next generation of Artificial Intelligence founders. They invest early in prime movers to leverage proprietary power and harness data network effects across the value stack. They aim to identify exceptional technical teams using AI to re-imagine industries. They act as trusted partners and understand founders are the ones building their company, they are simply there to help them reach their full potential. Whilst they strive to use data to inform their investment process, ultimately they are founders backing founders and will always be aligned.
  • Telefónica Open Future — Open Future is Telefónica’s global program of entrepreneurship and investment, aimed at attracting innovative products, services and talent to the Company, with the goal of integrating them into our value proposition for customers. It includes several initiatives of proven success, such as the Telefónica Ventures and Amérigo investment funds, the start-up accelerator Wayra and the Think Big and Talentum initiatives.

Prize: £100,000 investment by AI Seed & access to Wayra (Telefónica’s incubator)

AI Seed is offering to reward the winner in the form of a £100,000 investment. The investment is subject to approval from AI Seed and must meet the following conditions: (1) Satisfactory completion of Company due diligence, (2) Satisfactory completion of Founder due diligence and anti-money laundering checks, (3) All employees having entered into employee contracts in a form acceptable to the Fund, (4) Receipt of a disclosure letter with respect to warranties (wherein a general disclosure of a data room is unacceptable), (5) Receipt of agreements assigning IP rights to the Company in a form provided by the Fund and executed by each employee, (6) Receipt of board minutes recording Company changes, (7) Completion of shareholder resolutions adopting Company changes, and (8) Payment of the Fund’s fees and expenses by the Company.

The winner of the battle will also have the chance to participate in the Telefónica Open Future_ program. In this respect, the winner may have access, for up to six months, to Telefónica Open Future’s pre-acceleration services, subject to space availability (desk space and connectivity). After the six months of pre-acceleration, the winner will be evaluated and, in case of a positive evaluation by Telefónica Open Future, may have access to Wayra’s Acceleration Program. The Wayra Acceleration Program offers financing of up to $50,000 in the form of a convertible note, plus acceleration services for a maximum period of 12 months, valued at a maximum of $70,000, subject to the fulfillment of certain milestones agreed with Wayra (in the form of physical co-working space for the team, connectivity services, access to its network and know-how, consultancy services, entrepreneurship training, and access to the Wayra network of potential investors, other entrepreneurs and practitioners from the venture capital industry).

May the data be with you!

How to organize an entertaining startup battle with your PreSeries AI judge

Originally published on Medium

Today, startup competitions, idea pitches, and demo days are everywhere. If you are an entrepreneur looking for the best place to promote your business, event scouting can become a full-time job. But hey, better to drown in opportunities than to beg for attention, am I right? Looking only at 2017, there were almost 4,000 startup-related events around the globe (source: Crunchbase). But why so many of them? Well, startup competitions, demo days, and the like are a very popular way for event organizers to increase the reach of an event or conference by playing the “entertainment” card, and a great way to get some sponsorship. If we assume that 50% of all these events held some kind of startup competition (a conservative coin toss), that makes around 2,000 startup competitions for 2017 alone. When you take into consideration that they easily bring media coverage, you understand why everyone is fond of them, from entrepreneurs to event organizers and, of course, investors.

While we agree that the participation of startups makes events more entertaining and engaging, we don’t see any kind of innovation happening in how startup competitions are run, which is surprising coming from a community of disruption-lovers. The format is always the same: first the startups pitch, then the jury asks some questions, and last but not least a winner is selected.

At PreSeries, we spend every minute of every day thinking about AI and startup investing. And seeing the format of startup competitions untouched by disruption was making us sad! We had to do something! In an effort to solve this problem, a few years ago we launched, in collaboration with the PAPIs conferences, our own series of events. We call them the “AI Startup Battles” and our mantra is “Not yet another normal startup competition”. This time, the entire jury is replaced by PreSeries’ very own AI. Our AI engages in live Q&A sessions with the contenders on stage and is able to extract real-time insights thanks to machine learning. It leverages a database of more than 370k companies to compare against. Our very talkative AI converses with the contenders via a custom Amazon Echo present on-stage, making it a very space-efficient jury, as it can fit in a small box.

We believe that our AI can do a great job at judging the potential of early-stage startups by looking at features around team composition, industry, technological edge and funding history. We think there is nothing more entertaining than having a data-driven algorithm decide your fate in real-time! Don’t you agree?

Maybe you need to see it to believe it? Then, if you are in London in April, you’d better not miss the 8th AI Startup Battle at PAPIs Europe (April 5).

As an event organizer, you can also bring the AI Startup Battles to your event, lucky you! We can help you transform your startup competition into a unique AI-driven moment that your guests won’t forget. For more info, get in touch at battle(at)preseries(dot)com and we’ll make sure that the power of PreSeries remains on your side!

PreSeries joins FinTech Sandbox

Originally published on Medium

Building a FinTech startup is like riding a carriage on a dirt road. Sure, it’s exciting to follow the path less traveled, but say hello to the bumpiest ride of your life. In this analogy, let’s imagine that PreSeries, our machine-learning platform for startup investors, is a FinTech carriage that needs to find its way through the “data potholes”. With practice, navigating through the uncharted territory of startup data becomes second nature, but the dream of a road paved with better data remains strong.

The 4 steps of working with startup data!

But why is working with startup data such a challenge? At PreSeries, we are building an automated platform to scout and assess startups from around the globe in a few clicks. It goes without saying that startup data is our lifeblood, but it is … well … scarce, often outdated, expensive to source, and you encounter missing data as often as the word “disrupt” at a tech conference. That’s the nature of working with early-stage private companies: they’re not really open books. But hey, hate the game, not the players, right?

This is why we are very happy to announce that PreSeries is joining the FinTech Sandbox program. FinTech Sandbox is a Boston-based nonprofit that drives global FinTech innovation and collaboration. Their 6-month program provides access to data feeds and APIs from industry leading data partners, top quality cloud hosting from infrastructure partners, and much more. FinTech Sandbox is a thriving community of 2,200+ members, 70+ startups, and 40+ partners. We are thrilled to join this growing digital family!

This is an important step for us!

  1. Being part of such an amazing community of passionate FinTech experts makes us really proud. If you are amazed by the team running FinTech Sandbox (Jean Donnelly, David Jegen, Sarah Biller or Mona M. Vernon, to name just some) or the data partners (Thomson Reuters, S&P Global, Dun & Bradstreet or Edgar, to name a few), you may also want to check the startup alumni section: Quantopian, CircleUp or Nutonian, among others.
  2. Access to new premium data streams will help us increase the quality of our machine learning models. We want to develop the right models and tools so that our users can later access and customize them according to their preferences.
  3. Lastly, we are excited to work with the FinTech Sandbox data partners and explore ways to develop long-standing relationships with them. We are advocates for more data to find and assess startups and are excited to open a whole new market in terms of data consumption with the venture capital community.
The PreSeries Dashboard

Our mission is to build the long-awaited crawling & machine-learning infrastructure needed for better startup scouting and analysis, so startup investors don’t have to! For venture capitalists, our SaaS platform eliminates the time and cost of building their own machine-learning solution by democratizing access to predictive technologies. We are saving investors an estimated 2 to 5 years of development and between $6 and $10 million a year in development and maintenance costs (infrastructure, data providers, engineer and analyst salaries, etc.).

On a last note, I want to stress the fact that PreSeries is growing and looking for passionate people to join the team. If you want to help us make venture capital a more data-driven practice, fill out our application form! We’re looking for data engineers, data scientists, designers, front-end developers, as well as sales & marketing people. Looking forward to your application!