April 2017

Big Data and Machine Learning Special Report

Intel Open-Sources BigDL, Distributed Deep Learning Library for Apache Spark

Intel open-sources BigDL, a distributed deep learning library that runs on Apache Spark. It leverages existing Spark clusters to run deep learning computations and simplifies loading data from big datasets stored in Hadoop.

Netflix Demonstrates Big Data Analytics Infrastructure

At QCon San Francisco, engineers at Netflix discussed their big data strategy and analytics infrastructure. This included a summary of the scale of their data, their S3 data warehouse, and Genie, their big data federated orchestration system.

Zero-Shot Translation with Google Neural Machine Translation System

Google’s Multilingual Neural Machine Translation System creates an interlingua and translates between language pairs and phrases for which no direct translation was previously available, a capability dubbed Zero-Shot translation.

TensorFlow 1.0 Released

Google recently announced TensorFlow version 1.0. The Python API is now stable, and experimental APIs for Java and Go have been added. XLA delivers a significant performance increase. Keras can also be integrated with TensorFlow using a built-in module. tf.transform, tf.layers, tf.metrics, and tf.losses all add new features to the framework.

Beam Graduates to Top-Level Apache Project

Beam exits its incubation period and graduates to a top-level Apache project. The article covers Google's support for and contributions to the open-source project, its integration with various data processing backends, and more.

Article Series: An Introduction to Machine Learning

In this series, we give an introduction to some powerful but generally applicable techniques in machine learning.

Data Preprocessing vs. Data Wrangling in Machine Learning Projects

This article compares alternative techniques for preparing data, including extract-transform-load (ETL) batch processing, streaming ingestion, and data wrangling.

Cassandra: The Definitive Guide, 2nd Edition Book Review and Interview

Cassandra: The Definitive Guide, 2nd Edition, by Jeff Carpenter and Eben Hewitt, covers version 3.0 of the Cassandra NoSQL database. InfoQ spoke with co-author Jeff Carpenter.

Big Data Processing Using Apache Spark - Part 6: Graph Data Analytics with Spark GraphX

In this article, the author discusses Apache Spark GraphX, used for graph data processing and analytics, with sample code for graph algorithms such as PageRank, Connected Components, and Triangle Counting.

Practicing Machine Learning with Optimism

This article addresses a few examples of issues when using machine learning to solve real-world problems and hopefully provides some suggestions (and inspiration) for how to overcome the challenges.

Reactive Kafka

Rajini Sivaram talks about Kafka and reactive streams and then explores the development of a reactive streams interface for Kafka and the use of this interface for building robust applications.

Scaling Quality on Quora Using Machine Learning

Chun-Ho Hung and Nikhil Garg discuss Quanta, Quora's counting system powering their high-volume near-real-time analytics, describing the architecture, design goals, constraints, and choices made.

Petabytes Scale Analytics Infrastructure @Netflix

Tom Gianos and Dan Weeks discuss Netflix's overall big data platform architecture, focusing on storage and orchestration, and how they use Parquet on AWS S3 as their data warehouse storage layer.

ScyllaDB: Achieving No-Compromise Performance

Avi Kivity discusses ScyllaDB and the many design decisions behind it, from the programming language and programming model through low-level details up to the advanced cache design.

Big Data in the Real World: Technology and Use Cases

Mike Olson presents several use cases where big data is collected and analyzed to gather insights from the automotive, insurance, financial, and other sectors.

C4Media Inc. (InfoQ.com),
2275 Lake Shore Boulevard West,
Suite #325,
Toronto, Ontario, Canada,
M8V 3Y3