Throughout the week, I read a lot of blog posts, articles, etc., that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This is the “roundup” of the posts that have been most interesting to me this week.
This week has been somewhat hectic work-wise, so I have not read as much as I wanted, but this is what I found.
Streaming
- Apache Flink Community Announces 1.2.0 Release. Flink is a high-performing stream processing framework. They have now released version 1.2, which adds really exciting new functionality to the engine.
- Hazelcast Release Jet, Open-Source Stream Processing Engine. While we are on the subject of stream processing engines, this article is about a new stream processing engine with some innovative thinking in the way it works.
Distributed Computing
- Virtual Panel: Microservices in Practice. Panel discussion about the state of the art of microservices, and how they are likely to evolve.
SQL Server
- Extreme 25x compression of JSON data using CLUSTERED COLUMNSTORE INDEXES. In last week's roundup, I pointed out a post by Jovan Popovic about JSON data and clustered columnstore indexes. This week's post drills further into it and shows how you can get really impressive compression of the data.
- Exporting tables from SQL Server in json line-delimited format using BCP.exe. More by Jovan. This time about how SQL Server can be used to export the content of tables into line-delimited JSON format.
- SQL Server First Responder Kit. Through an article in InfoQ, I came across this very handy tool for anyone who has to do any kind of work with SQL Server.
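If line-delimited JSON, as mentioned in Jovan's export post above, is new to you: it simply means one complete JSON object per line, with no surrounding array. The snippet below is a minimal Python sketch of the format only. It is not the BCP.exe approach from the post, and the rows and helper names (`to_json_lines`, `from_json_lines`) are made up for illustration.

```python
import json


def to_json_lines(rows):
    """Serialize an iterable of dicts as line-delimited JSON (one object per line)."""
    return "\n".join(json.dumps(row) for row in rows)


def from_json_lines(text):
    """Parse line-delimited JSON back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical rows standing in for the content of a table.
rows = [
    {"id": 1, "name": "Widget"},
    {"id": 2, "name": "Gadget"},
]

text = to_json_lines(rows)
print(text)
```

Because each line stands on its own, a consumer can read and parse such a file one row at a time instead of loading the whole document first.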
Data Science
- Build an intelligent app with SQL Server and R. By now it should be pretty clear that SQL Server 2016 has some very impressive capabilities when it comes to Data Science. This post outlines how to get started building a predictive model using SQL Server 2016 and R.
- Retail customer analytics with SQL Server R Services. More about SQL Server and R. This time about analytics of retail customers.
- Machine Learning Your Way to Smarter API Error Responses. Presentation about how Machine Learning can be used to help you understand malformed API requests and respond with a best-fit response, as well as capture the user errors for future responses.
- Machine Learning and End-to-End Data Analysis Processes in Spark Using Python and R. Presentation by Debraj GuhaThakurta from the Microsoft Azure Machine Learning group, where he talks about machine learning and data analysis processes in Spark using Python and R.
Big Data and Data Lakes
- Load Data from Azure Data Lake into Azure SQL Data Warehouse at 3TB/Hour. Post about how to use the PolyBase support in Azure SQL Data Warehouse to load data from Azure Data Lake into Azure SQL Data Warehouse.
Shameless Self Promotion
So this is my shameless self promotion part, where I point out posts I have written, talks I am giving, etc.
- RabbitMQ - SQL Server. Post about how to send data from SQL Server to RabbitMQ.
- satRday - Cape Town. This is the second satRday conference ever - worldwide! My talk is about Microsoft R Server, and how it compares to CRAN R. I do believe there are still seats available, so come by and say Hi!
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.