Data Engineering Special Report

January 2019

A NoSQL Database Architecture for Real-Time Applications

Learn about a new kind of NoSQL database architecture that is simple and cost-effective, and that delivers speed at scale for real-time applications. This new architecture delivers predictable performance while using up to ten times fewer servers than most other databases. Learn more.

Face-api.js: JavaScript Face Recognition Leveraging TensorFlow.js

Face-api.js is a JavaScript API for face detection and face recognition in the browser, implemented on top of the TensorFlow.js core API. It implements a series of convolutional neural networks (CNNs) optimized for the web and for mobile devices.

Google Open-Sources BERT: A Natural Language Processing Training Technique

In a recent blog post, Google announced that it has open-sourced BERT, its state-of-the-art training technique for Natural Language Processing (NLP). Google decided to do this in part because of a lack of public data sets available to developers. In addition, Cloud TPUs have been optimized to reduce the time required to train NLP models.

Netflix Keystone Real-Time Stream Processing Platform

Netflix recently published a post on its tech blog discussing the design considerations and insights behind Keystone, its real-time stream processing platform. Keystone has been operational since December 2015 and has grown significantly as the number of Netflix subscribers rose from 65 million to over 130 million in the past three years. This article covers the latest state of the Keystone platform.

When, Where & Why to Use NoSQL?

Download this white paper and learn the biggest challenges of managing big data, database requirements for dealing with big data, and how NoSQL databases address these challenges. Download Now.

Redis 5.0 Released with New Streams Data Type

Redis recently announced version 5 of its popular database, 15 months after the release of Redis 4. Probably the most important feature of this version is support for a new data type, Streams. Sorted set functionality has also improved, and Redis modules have been expanded with the introduction of the Cluster and Timers APIs. LOLWUT and other improvements are reviewed in the article.

Concept and Object Modeling Notation for Data Modeling NoSQL Databases

Ted Hills hosted a workshop at the recent Data Architecture Summit 2018 Conference about data modeling for relational and NoSQL databases. He said that the NoSQL movement helped the database community realize two things. First, not every application needs ACID properties. Second, the tabular data organization is still a good choice for much data, although not for all datasets.

Spark Application Performance Monitoring Using Uber JVM Profiler, InfluxDB and Grafana

In this article, author Amit Baghel discusses how to monitor the performance of Apache Spark-based applications using Uber JVM Profiler, InfluxDB, and the Grafana data visualization tool.

Natural Language Processing with Java - Second Edition: Book Review and Interview

The book Natural Language Processing with Java - Second Edition covers NLP topics and the various tools developers can use in their applications. InfoQ spoke with co-author Richard Reese about the book.

Sentiment Analysis: What's with the Tone?

In this article, the authors discuss NLP-based sentiment analysis, comparing machine learning (ML) and lexicon-based approaches implemented with the KNIME data analysis tools.

Analytics Zoo: Unified Analytics + AI Platform for Distributed Tensorflow, and BigDL on Apache Spark

We describe how Analytics Zoo can help real-world users to build end-to-end deep learning pipelines for big data, including unified pipelines for distributed TensorFlow and Keras on Apache Spark.

The Architecture of a Real-Time Operational DBMS

Years ago, Aerospike’s engineering team set out to build a distributed database system that handles real-time workloads smoothly and provides a high level of fault tolerance. Learn how they built a high-performance, distributed database to handle the needs of today’s interactive online services. Learn More.

The New Kid on the Block: Spring Data JDBC

Jens Schauder describes the current state of Spring Data JDBC, its features and some of the underlying design decisions, especially its DDD-based API.

Big Data and Deep Learning: A Tale of Two Systems

Zhenxiao Luo explains how Uber tackles data caching in large-scale DL, detailing Uber’s ML architecture and discussing how Uber uses Big Data, concluding by sharing AI use cases.

Reactive Relational Database Connectivity

Ben Hale discusses Reactive Relational Database Connectivity (R2DBC), explaining how the API works, the benefits of using it, and how it contrasts with ADBA, the API proposed as a successor to JDBC.

Implementing AutoML Techniques at Salesforce Scale

Matthew Tovbin shows how to build ML models using AutoML (Salesforce), including techniques for automatic data processing, feature generation, model selection, hyperparameter tuning and evaluation.

C4Media Inc. (InfoQ.com),
2275 Lake Shore Boulevard West,
Suite #325,
Toronto, Ontario, Canada,
M8V 3Y3
