Throughout the week, I read a lot of blog posts, articles, etc., that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This is the “roundup” of the posts that have been most interesting to me this week.
Transaction Systems
- Omid reloaded: scalable and highly-available transaction processing. the morning paper looks at Apache Omid, a transactional framework that allows ACID transactions on top of MVCC key/value NoSQL data stores.
SQL Server
- Context in perspective 5: Living next door TLS. Ewald continues his series about context in SQL Server.
- SQL Server on Linux: Will it Perform or Not?. In the roundup for week 10, I pointed to Slava Oks's slide deck from QCon in London about SQL Server on Linux. This is the video from the presentation.
Streaming
- Implementing The Schema Registry. Interesting article about how Sky Betting & Gaming use the Confluent Schema Registry to ensure that various teams always encode and decode messages using the same schema.
- Queryable State in Apache Flink® 1.2.0: An Overview & Demo. A very interesting post about how Apache Flink now allows you to query application state from external applications.
Data Science
- Convolutional neural networks, Part 1. the morning paper dissects some white papers about Convolutional Neural Networks (CNN).
- Is it possible to use RevoScaleR package in Power BI?. Tomaz shows how RevoScaleR can be used from inside Power BI. Pretty cool!
- Alteryx integrates with Microsoft R. Revolution Analytics posts about how Alteryx now supports Microsoft R Server as well as SQL Server R Services. Alteryx is a workflow tool combining data preparation, data blending, and analytics – predictive, statistical and spatial. It looks very interesting!
- Running your R code on Azure with mrsdeploy. Another blog post from Revolution Analytics. This one explains how to provision and run an Azure virtual machine (VM), using the mrsdeploy library that comes installed with Microsoft’s R Server.
- Retail Customer Churn Prediction: How-To Guide Now Available. Predicting customer churn is almost the “holy grail” in machine learning. Microsoft has done a lot of research about churn prediction, and has now released its Retail Customer Churn Prediction Solution How-to Guide.
- End-to-End Data Science Walkthrough with Spark 2.0 on Azure HDInsight Hadoop Clusters. Microsoft has published a tutorial on how to use PySpark and MLlib for data science on Spark 2.0 clusters.
- Announcing R Tools 1.0 for Visual Studio 2015. More about R Tools for Visual Studio.
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.