Throughout the week, I read a lot of blog-posts, articles, etc., that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This is the “roundup” of the posts that have been most interesting to me for the week just gone by.
Streaming
- It’s Okay To Store Data In Apache Kafka. An interesting post about using Kafka for data storage.
Cloud
- Events, Data Points, and Messages - Choosing the right Azure messaging service for your data. For a while, the messaging infrastructure of Microsoft Azure has consisted of Azure Service Bus and Azure Event Hubs. Recently Microsoft also introduced Azure Event Grid. In this post, Clemens Vasters looks at which service to use when.
- Azure Serverless end-to-end with Functions, Logic Apps, and Event Grid. A Channel 9 video, which gives a brief overview of each of the components of the Serverless story in Azure.
Distributed Computing
- Zero to Production-Ready in Minutes. A presentation from InfoQ about how Netflix is enabling engineers to go from “zero” to “production ready” in minutes.
SQL Server
- PowerShell connection to SQL Server: MARS enabled, pooling disabled. Lonny posts about PowerShell, MARS and connection pooling. Cool stuff!!
Data Science
- Microsoft R Open 3.4.1 now available. David at Revolution Analytics posts about how Microsoft R Open has been upgraded to R version 3.4.1. Let’s hope that Microsoft R Server will be upgraded soon too.
- NEURAL NETWORKS DEMYSTIFIED 1: Classification Problems. First post in a series attempting to make Neural Networks understandable for people who know nothing more than high school math (e.g. myself). I’ll follow it with interest!
- Simplifying The Use of Azure Data Science Virtual Machine with R. This post talks about AzureDSVM, an R package that makes it possible to directly manage an Azure Data Science Virtual Machine (DSVM).
- How to write distributed TensorFlow code — with an example on TensorPort. TensorFlow is an awesome framework for machine learning, but it is not straightforward to write TensorFlow code in a distributed fashion. This blog-post tries to explain how to run distributed TensorFlow.
- The Keys to Effective Data Science Projects – Explore the Data. Buck Woody continues his The Keys to an Effective Data Science Project series. This time he looks at how a data scientist should explore the data he works with. Some really useful tips and comments in there.
SQL Server R Services
- SQL Server R Services: The Basics. I am happy to see that there are more people than me writing about SQL Server R Services. This is the first post in a series about SQL Server R Services, written by Robert Sheldon. Quite a lot of cool stuff in there.
Speaking of SQL Server R Services, I have now finished my speaking engagements (for now), and can start writing about SQL Server R Services again. By the end of September I should have Internals - XI ready to publish. In XI I cover the internal data transfer protocol, Binary eXchange Language (BXL). If you are interested, Internals - X is here.
~ Finally
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.