Throughout the week, I read a lot of blog-posts, articles, and so forth, that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.
.NET
- Using .NET and Docker Together – DockerCon 2018 Update. What the title says: how to use .NET and Docker together.
- Tools for Exploring .NET Internals. An excellent write up by Matthew about tools you can use to better understand what goes on in the CLR.
Big Data / Cloud
- Control Azure Data Lake costs using Log Analytics to create service alerts. This blog post talks about Azure Data Lake Store (ADLS) and Azure Log Analytics (ALA) and how you can use ALA to control ADLS costs. Very informative!
- Metacat: Making Big Data Discoverable and Meaningful at Netflix. This post is about Netflix's Metacat system, which acts as a federated metadata access layer for all the data stores Netflix has.
Distributed Computing
- Medea: scheduling of long running applications in shared production clusters. Adrian dissects a white paper about Medea. Medea is designed to support the use case of running a mix of long-running applications and shorter-duration tasks within the same cluster.
Streaming
- Accessing Event Hubs with Confluent Kafka Library. Sometime last month, Microsoft announced Event Hubs support for the Kafka protocol. What Kafka protocol support means is that you can use the Kafka client libraries to ingest data into an Event Hub, sweet! This blog post shows an example of how to do that.
- Deploying Kafka on Kubernetes with Local Persistent Volumes using Strimzi. Strimzi is a system that provides the ability to run Kafka clusters on OpenShift and Kubernetes. The blog post discusses various Kafka storage options when running Kafka this way.
- Democratizing Stream Processing with Apache Kafka and KSQL. An article which discusses stream processing with KSQL, the streaming SQL engine for Apache Kafka, and how KSQL helps to bridge the world of streams and databases through streams and tables.
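To give a feel for what "use the Kafka client libraries against Event Hubs" means in practice, here is a minimal sketch of the client settings involved. It is not taken from the linked post. The function name `event_hubs_kafka_config`, the namespace `my-namespace` and the connection string are placeholders I made up for illustration; the setting names are standard Kafka client (librdkafka style) properties, and the Kafka endpoint of an Event Hubs namespace listens on port 9093.

```python
def event_hubs_kafka_config(namespace: str, connection_string: str) -> dict:
    """Build Kafka client settings for an Event Hubs namespace.

    Hypothetical helper for illustration only; it just assembles a dict.
    """
    return {
        # The Kafka endpoint of an Event Hubs namespace is exposed on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        # Event Hubs requires TLS with SASL PLAIN authentication.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is the literal string "$ConnectionString";
        # the password is the namespace connection string itself.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }


# Placeholder values, not real credentials.
config = event_hubs_kafka_config(
    "my-namespace",
    "Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=placeholder;SharedAccessKey=placeholder",
)

for key, value in config.items():
    # Avoid printing the secret in full.
    shown = "<hidden>" if key == "sasl.password" else value
    print(f"{key} = {shown}")
```

Assuming the `confluent-kafka` Python package is installed, a dict like this should be usable as the configuration for `confluent_kafka.Producer(config)`, with the Event Hub name used as the topic name when producing.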
Data Science
- Advanced Technologies for Detecting and Preventing Fraud at Uber. A blog post discussing how Uber leverages cutting-edge systems to tackle fraud on their platform.
SQL Server Machine Learning Services
At the moment I have two posts about SQL Server Machine Learning Services “on the go”. The first is the never-ending follow-up post to my sp_execute_external_script and SQL Compute Context - I post from four weeks ago. I cannot seem to get that one done.
The second is a post about the options you have if you want to install R packages into SQL Server Machine Learning Services. I have good hopes of publishing that one sometime this coming week.
~ Finally
That’s all for this week. I hope you enjoy what I did put together. If you have ideas for what to cover, please comment on this post or ping me.