Throughout the week, I read a lot of blog posts, articles, etc., that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both database and non-database)
- and other “stuff”
This is the “roundup” of the posts that have been most interesting to me this week.
SQL Server
- Choosing a Primary Key. Another post in Sean Cremer’s series about database design. Full disclosure, he is a colleague of mine - but still a very good guy :). His SQL knowledge is immense!
- SQL Server 2016 Developer Edition in Windows Containers. Announcement of, and introduction to, the availability of SQL Server 2016 Developer Edition in Windows containers. This is a “biggie” for me, as I often want to spin up a new SQL Server instance. Now I can just have a container with the instance on it and spin it up!
- SQLskills SQL101: Stored Procedures. First in a series of “back to the basics” posts by Kimberly and Paul. This covers my absolute favorite feature in SQL Server: Stored Procedures! To all of you who say they are no good, I have a word for you: Heathens! :)
- Architecting SQL Server on Linux: Slava Oks on Drawbridge, LibOS, & Addressing Between Windows/Linux. A podcast with Slava Oks, where Slava talks about the implementation of SQL Server on Linux.
Streaming
- Beam Graduates to Top-Level Apache Project. Beam is an Apache project seeking to create a unified programming model for streaming and batch processing jobs, and to produce artifacts that can be consumed by a number of supported data processing engines.
- Fundamentals of Stream Processing with Apache Beam. More about Beam. This is a presentation about Beam’s out-of-order stream processing, as well as how Beam tries to simplify complex tasks.
- Kafka Summit New York. If you are doing streaming, then you are most likely interested in, or at least have heard about, Kafka. The yearly Kafka conference is coming up, so go ahead and register.
Data Science
- Data Science in the Cloud @StitchFix. A conference presentation about how the cloud enables over 80 data scientists to be productive at StitchFix.
- Elastic Data Analytics Platform @Datadog. A conference presentation about Datadog’s cloud-based analytics platform and how it differs from a traditional datacenter-based analytics stack.
- R Tools for Visual Studio. R Tools for Visual Studio is getting closer and closer to a version 1.0 release.
- Prophet - Forecasting at Scale. Prophet is an open-source package for R and Python that implements the time-series methodology that Facebook uses in production for forecasting at scale. Looks very, very interesting.
- Microsoft R Server. I have to do some shameless self-promotion :). This is a blog post by me comparing how CRAN R and Microsoft R Server handle large datasets.
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.