Data Engineering Special Report

In this special newsletter, we bring you up to date on all the new content and news related to Data Engineering on InfoQ. We also maintain a portal page for this content on InfoQ at: https://www.infoq.com/ai-ml-data-eng.

What Machine Learning Can Learn from DevOps (article, Dec 15, 2018)
Microsoft Announces AI-Assisted IntelliCode for TypeScript and JavaScript in VS Code (news, Dec 10, 2018)
TensorSpace.js Delivers Neural Network 3D Visualization Framework (news, Dec 06, 2018)
Amazon Introduces Intelligent-Tiering for S3 Storage to Automatically Optimize Costs (news, Dec 05, 2018)
Azure Machine Learning Services Now Generally Available (news, Dec 05, 2018)

Top Viewed Content on InfoQ

Apache Kafka: Ten Best Practices to Optimize Your Deployment (article, Oct 19, 2018)
Back to the Future with Relational NoSQL (article, Dec 04, 2018)
The Evolution of Uber's 100+ Petabyte Big Data Platform (news, Nov 10, 2018)
Scaling Apache Kafka at Pinterest (news, Dec 09, 2018)
Amazon Announces Managed Streaming for Kafka in Public Preview (news, Dec 06, 2018)

Face-api.js is a JavaScript API for face detection and face recognition in the browser, implemented on top of the tensorflow.js core API. It implements a series of convolutional neural networks (CNNs), optimized for the web and for mobile devices.

In a recent blog post, Google announced they have open-sourced BERT, their state-of-the-art training technique for Natural Language Processing (NLP). Google has decided to do this, in part, due to a lack of public data sets that are available to developers. In addition, optimizations have been made to Cloud TPUs to reduce the amount of time required for training NLP models.

Netflix recently published a post on their tech blog discussing the design considerations and insights of Keystone, their real-time stream processing platform. Keystone has been operational since December 2015 and has grown significantly as Netflix subscribers have grown from 65 million to over 130 million in the past 3 years. This article covers the current state of the Keystone platform.

Redis recently announced version 5 of its popular database, 15 months after the release of Redis 4. Probably the most important feature of this version is support for a new data type, Streams. Sorted set functionality has also improved, and Redis modules have been expanded with the introduction of the Cluster and Timers APIs. LOLWUT and other improvements are reviewed in the article.

Ted Hills hosted a workshop at the recent Data Architecture Summit 2018 Conference about data modeling for relational and NoSQL databases. He said that the NoSQL movement helped the database community realize two things. First, not every application needs ACID properties. Second, tabular data organization is still a good choice for much data, although not for all datasets.

In this article, author Amit Baghel discusses how to monitor the performance of Apache Spark based applications using the Uber JVM Profiler, InfluxDB, and the Grafana data visualization tool.

The book Natural Language Processing with Java - Second Edition covers NLP topics and the various tools developers can use in their applications. InfoQ spoke with co-author Richard Reese about the book.

In this article, the authors discuss NLP-based sentiment analysis using machine learning (ML) and lexicon-based approaches with the KNIME data analysis tools.

We describe how Analytics Zoo can help real-world users build end-to-end deep learning pipelines for big data, including unified pipelines for distributed TensorFlow and Keras on Apache Spark.

Jens Schauder describes the current state of Spring Data JDBC, its features, and some of the underlying design decisions, especially its DDD-based API.

Zhenxiao Luo explains how Uber tackles data caching in large-scale deep learning (DL), detailing Uber’s ML architecture and discussing how Uber uses Big Data, and concludes by sharing AI use cases.

Ben Hale discusses Reactive Relational Database Connectivity (R2DBC), explaining how the API works, the benefits of using it, and how it contrasts with ADBA, the API proposed as a successor to JDBC.

Matthew Tovbin shows how to build ML models using AutoML (Salesforce), including techniques for automatic data processing, feature generation, model selection, hyperparameter tuning, and evaluation.