Throughout the week, I read a lot of blog-posts, articles, etc., that have to do with things that interest me:
- data science
- data in general
- distributed computing
- SQL Server
- transactions (both db as well as non db)
- and other “stuff”
This is the “roundup” of the posts that have been most interesting to me this week.
Data Science
- Deep Learning for Sensor Fusion and Sequence Classification. Faisal discusses how the Microsoft Cognitive Toolkit can be used for sequence classification.
- Data Preprocessing vs. Data Wrangling in Machine Learning Projects. Article from InfoQ, which compares alternative techniques for preparing data for machine learning. Techniques include extract-transform-load (ETL) batch processing, streaming ingestion and data wrangling.
- TensorFlow 1.0 Released. Another article from InfoQ. This article is about the 1.0 release of Google’s TensorFlow.
- rxNeuralNet vs. xgBoost vs. H2O. In version 9.0.3 of Microsoft R Server, Microsoft introduced a new package: MicrosoftML. The package brings new machine learning functionality with improvements in speed, performance and scalability. In his blog-post, Tomaz puts the new functionality to the test.
- Microsoft Data Science Newsletter. If you are interested in what Microsoft is doing in data science you should definitely subscribe to the monthly newsletter.
- Employee Retention with R Based Data Science Accelerator. Cool “stuff” from Revolution Analytics about how to use R to analyze employee retention.
- Announcing R Tools for Visual Studio. R Tools for Visual Studio has been released!
Distributed Computing
- Redundancy does not imply fault tolerance: analysis of distributed storage reactions to single errors and corruptions. Adrian dissects a white-paper on how distributed storage systems react to single errors and corruptions. My conclusion: be afraid, be very afraid!
SQL Server
- Why PFS pages cannot be repaired. Paul Randal of SQLskills fame explains why DBCC CHECKDB cannot repair Page Free Space pages. Very cool “stuff”!
- SQL Server on Linux, will it perform? This is the slide deck from Slava Oks’s presentation at QCon in London this year about SQL Server on Linux. Amazing! Cannot wait for the video to be published!
- Context in perspective 1: What the CPU sees in you. Ewald has a series about context in SQL Server, and this is the first post. So, so interesting! As a side note: you should really follow Ewald’s blog if you are interested in various and sundry deeply technical topics about how SQL Server works under the covers! WinDbg FTW!
That’s all for this week. I hope you enjoy what I put together. If you have ideas for what to cover, please comment on this post or ping me.