Bad data costs the US $3 Trillion per year

Issue 54: CognitionX Data Science, AI and Machine Learning Briefing


Debates we’ve had this week:

  • Is the new European Machine Intelligence Landscape issued by Project Juno an accurate representation?
  • Should CognitionX host our own Robot Wars, now that you can build a robot with TensorFlow and $100?
  • Will AI eliminate 6% or 60% of jobs?

NOW, you have a place to debate them with us 🙂

The CognitionX Community platform was rolled out this week. Members will have just received login details so they can connect, comment and engage. If you aren’t a member already, register now or you’ll miss out on all the fun: forums and more.

Tabitha ‘OverTheMoon’ Goldstaub

Deal of the Day

Apple acquires Tuplejump

Apple’s machine learning buying spree shows no sign of stopping. Its latest acquisition is Tuplejump, an India/US-based company that specialises in processing and analysing big sets of data quickly. Terms of the deal have not been published.


Feed your mind over lunch

One important way to think about your job and how future-proof it is

We have had many estimates of how many jobs will be lost to machines: 5 million by 2020, 6% of all jobs by 2021. But how do you know whether YOUR job is at risk? One way is to distinguish between algorithmic and heuristic work. Algorithmic work can be done by anyone – or anything: if you perform the same set of tasks, step by step, over and over again, your work is algorithmic. Heuristic work requires trial and error, cleverness and insight; it isn’t guaranteed to produce the same result twice. That type of work still requires a human touch.
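For the programmers among our readers, the distinction can be sketched in code (a toy illustration of ours, not from the original article): algorithmic work follows a fixed recipe and always yields the same result, while heuristic work is trial and error with no such guarantee.

```python
import random

# Algorithmic work: a fixed, repeatable recipe.
# The same input always produces the same output.
def alphabetize(names):
    return sorted(names)

# Heuristic work: trial and error in search of a good answer.
# Different runs can land on different results.
def hill_climb(score, start, steps=1000, rng=random):
    best = start
    for _ in range(steps):
        candidate = best + rng.uniform(-1.0, 1.0)  # try a random tweak
        if score(candidate) > score(best):         # keep it only if it helps
            best = candidate
    return best
```

The first function is the kind of work machines already do well; the second still needs a human to choose what counts as a good score and to judge the outcome.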


Stats that Impress

Bad data costs the US $3 Trillion per year

IBM estimates that the yearly cost of poor quality data, in the US alone, in 2016 is $3.1 trillion. Managers, data scientists and decision makers have to accommodate bad data in their work every day, which is time-consuming and expensive. Some more figures: 60% – the estimated fraction of time that data scientists spend cleaning and organising data; 50% – the amount of time wasted in hidden data factories, finding and correcting errors.



Teaching computers to identify odours

Researchers at Harvard University were able to train a computer to recognise the neural patterns associated with various scents, and to identify whether specific odours were present in a mix of smells. The study shows that machine learning algorithms can be powerful tools for studying the sense of smell, and a way to design and test experiments in a virtual space before conducting them in the real world.


Pure unadulterated research

Google’s image captioning open sourced

In 2014, the Google Brain team trained an ML system to automatically produce captions that accurately describe images, which went on to win the Microsoft COCO 2015 image captioning challenge. Now the latest version of the system has been open sourced in TensorFlow. The new release is much faster to train and produces more accurate descriptions.


Machine learning on Twitter closely tracks state-level polls

According to a paper written by Nick Beauchamp, an assistant professor of political science at Northeastern University, machine learning applied to more than 100 million tweets during the 2012 election cycle closely represents the results of state-level polling.


Another example of “robots taking human jobs”

Why hire a lawyer when a robot will do?

Virtual lawyers are able to read and understand natural language in order to categorise documents, sorting massive piles of legal paperwork into smaller piles based on relevance. They are trained by feeding them thousands of documents and contract clauses. Compared with traditional methods, using algorithms can increase efficiency by at least 50 percent and cut time spent by 90 percent.
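As a rough illustration of the underlying technique (our sketch, not the actual commercial systems), the core idea is a text classifier trained on labelled examples. Here is a minimal naive Bayes model in plain Python that sorts clauses into “relevant” and “irrelevant” piles — the labels and training snippets are invented for the demo:

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

class NaiveBayes:
    """Minimal bag-of-words naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, labelled_docs):
        for text, label in labelled_docs:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def log_score(self, text, label):
        total_docs = sum(self.doc_counts.values())
        logp = math.log(self.doc_counts[label] / total_docs)  # class prior
        n_words = sum(self.word_counts[label].values())
        v = len(self.vocab)
        for word in tokenize(text):
            # Laplace smoothing so unseen words don't zero out the score.
            logp += math.log((self.word_counts[label][word] + 1) / (n_words + v))
        return logp

    def predict(self, text):
        return max(self.doc_counts, key=lambda lbl: self.log_score(text, lbl))

# Toy training set: a few labelled snippets standing in for
# the thousands of documents a real system would see.
clf = NaiveBayes()
clf.train([
    ("termination clause breach of contract", "relevant"),
    ("indemnification liability clause", "relevant"),
    ("office party pizza friday", "irrelevant"),
    ("team lunch schedule", "irrelevant"),
])
clf.predict("liability for breach")  # -> "relevant"
```

Real systems use far richer features and models, but the workflow is the same: label examples, train, then let the classifier do the first pass of sorting.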


Weekend Watching

A film that only makes sense to developers and engineers

“Netflix released an eerie original film noir this month called Meridian that combines a classic detective tale with bizarre visuals, loud special effects, and creepy imagery. The 12-minute film got 1.5 stars and a few reviews on Netflix. But it wasn’t made for casual viewers. It was released for developers and engineers.” Netflix is giving the project away for free.


Business Impact of AI

Digital laggards must harness data or get left behind

According to research by the Harvard Business School, companies that landed in the upper 25 percent have better gross margins, better earnings and higher net income than the bottom quarter. The main driver of the difference is that the top performers are more effective at putting their data to use.


Coding experiments

AI playing Doom

VizDoom is a competition that asks the question “Can AI effectively play Doom using only raw visual input?”. This year’s competition was held in Santorini, Greece. The winners were F1 (programmed by Facebook AI researchers) and IntelAct (programmed by Intel Labs), but there were plenty of other interesting bots created by students.


Copyright © 2016 CognitionX, All rights reserved.

