Throughout the week, I read a lot of blog posts, articles, and so forth that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both database and non-database)
- and other “stuff”
This blog post is the “roundup” of the things that have been most interesting to me for the week just ending.
SQL Server & Other Database Systems
- SQL Server on Linux: Quick Performance Monitoring. A blog post by Bob Dorr where he talks about how we can get a Performance Monitor-like view on Linux for SQL Server.
- MySQL Version 8 Adds Document Store, Performance and Security Improvements. An InfoQ article about new functionality in the latest version of MySQL. What caught my eye was the support for document store. I guess, depending on how it is done, having a combination of Document Store and a relational database can be very powerful.
Streaming
- Apache Flink 1.5.0 Release Announcement. This post announces the release of Flink version 1.5.0. There are quite a few exciting new features, among them: broadcast state, task-local state recovery, and support for windowed outer equi-joins in the SQL and Table APIs. I have just set up a new CentOS 7 virtual machine so I can check this out.
- New Confluent Cloud Professional, and Ecosystem Expansion to Google Cloud Platform. I do not know if you knew this, but there is a rule saying that if there is an announcement from Flink, there also has to be an announcement from Kafka (and vice versa). So, the Kafka announcement is that Confluent Cloud is now also available on Google Cloud Platform (in addition to AWS). I really hope to see Azure in the mix in the not too distant future.
.NET
- Discussions on the Future of .NET Core. An InfoQ article where five industry veterans discuss the .NET Core platform. Some takeaways:
  - The .NET Core platform provides significant performance benefits as compared to the traditional .NET Framework.
  - .NET Core benefits from a server-centric design that is not Windows-focused.
  - .NET Core is now a stable platform suitable for new application development.
Distributed Computing
- Open Sourcing Zuul 2. An announcement from Netflix that they are now open-sourcing their cloud gateway, Zuul 2. A very interesting point in the post was its discussion of anomaly detection and contextual alerting.
Data Science
- Data Pipelines for Real-time Fraud Prevention at Scale. An InfoQ presentation about the architecture of PayPal’s data service, which combines a Big Data approach with providing data in real time for decision making in fraud detection.
- Pixie: a system for recommending 3+ billion items to 200+ million users in real-time. A white paper dissected by Adrian, describing how Pinterest has built a system for recommending 3+ billion items to 200+ million users in real time!
- Enterprise Deployment Tips for Azure Data Science Virtual Machine (DSVM). I came across this post thanks to Luis and his weekly newsletter. The post he pointed to is about the Azure Data Science Virtual Machine (DSVM) and how to use and deploy it in an enterprise environment. This topic is especially interesting right now as we at Derivco are looking at the DSVM. Thanks, Luis!
SQL Server Machine Learning Services
In last week’s roundup I mentioned that I would probably have to write some follow-up posts to sp_execute_external_script and SQL Compute Context, a post I published a week ago. I have now started on the first follow-up post, and I hope I can publish it in a week or two.
~ Finally
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.