00:00:17.940
Hello, everyone. It's time for my talk. Today, I will discuss Norikra, open-source server software for processing data streams using SQL.
00:00:22.680
Norikra is written in JRuby and runs on the JVM. It lets you add and manage stream queries easily, without needing editors, compilers, or deployments.
00:00:26.619
I will cover several topics today, including the implementation of Norikra and its applications at LINE Corporation. First, it’s important to understand what Norikra is and how it works.
00:00:41.260
My name is Satoshi Tagomori, and I am based in Tokyo, Japan. I work at LINE Corporation, an internet service company that provides a messaging application similar to WhatsApp or Facebook Messenger.
00:01:11.290
LINE has about 130 million users worldwide, primarily in Asia and South America, with some presence in Europe. Besides messaging, we also host many sub-services on our platform, such as Japanese manga, e-publishing, Q&A services, news, and games.
00:01:36.790
As a result, we handle a huge volume of logs and metrics. Our data analytics platform, while simple, is vital for monitoring and analytics. We need to collect and cleanse data from many servers and store it in distributed storage solutions like Hadoop's HDFS.
00:02:22.360
Once data is stored, we process it and visualize the results in various formats, such as graphs and charts. I am not currently a committer for the project, as I was involved in a project called Kyoto Tycoon related to Fluentd log management.
00:02:50.040
Roughly speaking, Fluentd is a log management system that aggregates logs into a remote storage system. We use it alongside Hadoop to maintain a seamless data processing pipeline.
00:03:16.120
This is crucial as we need real-time processing of web service traffic to quickly identify any issues or changes in our service. This includes monitoring HTTP response codes, request rates per second, and response times.
00:03:31.660
Currently, we generate these graphs using several tools, which have various plugins that allow us to customize our stream-processing capabilities.
00:04:02.920
While Fluentd offers simple data processing and visualization, it becomes cumbersome for complex scenarios. We often need to adjust configurations, which requires a restart and can interrupt processing.
00:04:35.000
Fluentd is not ideal for managing complex datasets or environments subject to frequent schema changes. We need tools that let us add and change processing queries dynamically.
00:05:02.240
Application engineers may not be software engineers, yet they understand what metrics are important for our services. Therefore, we need a system that enables these stakeholders to construct queries themselves.
00:05:38.020
This is where Norikra can help. Norikra is a stream processing middleware that enables processing using SQL, allowing for flexibility and responsiveness in adjusting to business needs.
00:06:03.079
It is distributed as a Ruby gem and can be easily installed via RubyGems. Moreover, it launches a server that can be controlled through various client interfaces and a web UI.
00:06:22.500
Let me show you a demo of how to set up and use Norikra. First, you can install it using the command line, and once installed, you can use the JSON interface to feed data into it.
00:06:43.580
Norikra can process JSON objects with specified fields like name and quantity. We can select specific fields from the event streams with simple SQL queries, which allows us to efficiently visualize the results in the console.
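The field selection described here can be illustrated with a minimal Python sketch. It does not use Norikra or its API; it only mimics the effect of a query along the lines of `SELECT name, quantity FROM events WHERE quantity > 5` over JSON events. The function name `select_fields`, the stream name `events`, and the sample data are assumptions made for illustration.

```python
import json


def select_fields(raw_events, fields, predicate=lambda event: True):
    """Yield only the requested fields of each JSON event that passes the predicate.

    Roughly: SELECT <fields> FROM <stream> WHERE <predicate>.
    """
    for raw in raw_events:
        event = json.loads(raw)  # each incoming event is one JSON object
        if predicate(event):
            yield {field: event.get(field) for field in fields}


# Hypothetical input events with the fields mentioned in the talk.
raw_events = [
    '{"name": "apple", "quantity": 3}',
    '{"name": "banana", "quantity": 12}',
]

rows = list(
    select_fields(raw_events, ["name", "quantity"], lambda e: e["quantity"] > 5)
)
print(rows)
```

Only the second event satisfies the condition, so `rows` holds a single record for `banana`.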
00:07:20.890
This flexibility is significant. If we change the input data schema, Norikra can still handle the new input effectively without necessitating significant adjustments in the query structure.
00:08:04.360
For example, if we receive additional fields in our input, we can dynamically manage these changes within Norikra. This adaptability simplifies managing multiple input schemas.
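One way to picture this tolerance for changing schemas is the sketch below. It assumes, for illustration only, that a query is applied to every event carrying all the fields the query references, and that extra fields are simply ignored. The helper names `matches` and `run_query` are hypothetical and are not part of Norikra.

```python
def matches(event, required_fields):
    """An event qualifies if it carries every field the query refers to."""
    return all(field in event for field in required_fields)


def run_query(events, required_fields):
    """Project qualifying events onto the required fields; skip the rest."""
    return [
        {field: event[field] for field in required_fields}
        for event in events
        if matches(event, required_fields)
    ]


events = [
    {"name": "apple", "quantity": 3},
    {"name": "banana", "quantity": 12, "price": 0.5},  # a new field appears
    {"name": "cherry"},                                # quantity is missing
]

# The existing query keeps working even though a new field showed up.
old_query = run_query(events, ["name", "quantity"])
# A query added later can use the new field without touching the old one.
new_query = run_query(events, ["name", "price"])
print(old_query)
print(new_query)
```

In this sketch the older query is unaffected by the added `price` field, and the newer query only sees events that actually contain it.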
00:08:47.640
Moreover, we can aggregate data using various SQL commands. Norikra supports counting and summing operations, making it straightforward to extract useful insights from real-time data.
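The counting and summing mentioned here are typically computed over time windows. The following sketch shows a simple tumbling-window aggregation in plain Python, as an illustration of the idea rather than of Norikra's implementation. The event fields `time`, `name`, and `quantity` and the 10-second window are assumptions.

```python
from collections import defaultdict


def tumbling_aggregate(events, window_seconds):
    """Group events by (window start, name) and compute COUNT and SUM(quantity)."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for event in events:
        # Align the event time to the start of its fixed-size window.
        window_start = event["time"] - event["time"] % window_seconds
        bucket = totals[(window_start, event["name"])]
        bucket["count"] += 1
        bucket["sum"] += event["quantity"]
    return dict(totals)


# Hypothetical events; "time" is in seconds since the start of the stream.
events = [
    {"time": 0, "name": "apple", "quantity": 3},
    {"time": 4, "name": "apple", "quantity": 2},
    {"time": 7, "name": "banana", "quantity": 1},
    {"time": 11, "name": "apple", "quantity": 5},
]

result = tumbling_aggregate(events, window_seconds=10)
for key, value in sorted(result.items()):
    print(key, value)
```

The first window (0 to 9 seconds) yields one row per name, and the event at 11 seconds falls into the next window.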
00:09:16.340
In addition, we can customize our queries and push the results directly to different output formats, creating a flexible environment for data processing.
00:09:46.620
The tools we use with Norikra allow for varied and complex data manipulation while ensuring that output can be delivered quickly and effectively.
00:10:25.439
In our production environment, we utilize Norikra to summarize error logs. When messages are sent via our API, they can be monitored and errors are aggregated to prevent flooding our partners with too many error messages.
00:10:49.279
This summarization helps us manage our interactions with partners effectively while ensuring critical issues are highlighted through various means, such as email notifications.
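The error summarization described here can be sketched as follows: all errors collected in one time window are grouped per partner, so each partner receives a single summary instead of one message per error. This is an illustrative assumption about the shape of the logic, not LINE's production code. The field names `partner` and `error` and the function `summarize_errors` are hypothetical.

```python
from collections import Counter


def summarize_errors(error_events):
    """Collapse the errors of one time window into one summary line per partner."""
    counts = Counter((e["partner"], e["error"]) for e in error_events)
    by_partner = {}
    for (partner, error), n in sorted(counts.items()):
        by_partner.setdefault(partner, []).append(f"{error} x{n}")
    return {partner: "; ".join(lines) for partner, lines in by_partner.items()}


# Hypothetical errors observed within a single window.
window = [
    {"partner": "A", "error": "timeout"},
    {"partner": "A", "error": "timeout"},
    {"partner": "A", "error": "bad_request"},
    {"partner": "B", "error": "timeout"},
]

summary = summarize_errors(window)
for partner, text in summary.items():
    print(f"notify {partner}: {text}")
```

Four error events become two notifications, one per partner, which is the flood-prevention effect the summarization is meant to achieve.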
00:11:30.290
Our partnership relies on timely responses to issues, and Norikra's data handling capabilities have allowed us to streamline this process.
00:12:11.560
We’re also using Hadoop to analyze larger datasets concurrently, creating reports and insights across services while maintaining high performance.
00:12:54.150
These reports are generated on a scheduled basis, providing visibility into key metrics and performance measures essential for our team's decision-making processes.
00:13:34.120
In summary, Norikra is a versatile tool for managing real-time data streams effectively. It supports the rapid development of analytics solutions tailored to specific business requirements.
00:14:04.509
The architecture we've developed using Norikra complements other platforms like Google BigQuery, allowing us to analyze and visualize data efficiently.
00:14:51.310
Moreover, the integration of processing tools with a focus on SQL capabilities provides a robust framework for both stream and batch data analytics, aligning with current industry practices.
00:15:33.020
To sum up, if you are interested in Norikra or stream processing, I highly encourage you to check the documentation available on GitHub and give Norikra a try. It can significantly enhance your data processing capabilities.