With the new ParquetDirect technology (in preview), we are making interactive queries over the data lake a reality. ParquetDirect is designed to access Parquet files through native support built directly into the engine. Through improved data scan rates, intelligent data caching, and columnstore batch processing, we’ve improved PolyBase execution by over 13x.
The Azure Data Lake Analytics U-SQL runtime is upgrading from .NET Framework v4.5.2 to .NET Framework v4.7.2. This change carries a small risk of breaking changes if you use U-SQL custom assemblies that include .NET libraries whose behavior has changed. To learn more, see “Troubleshoot the U-SQL job failures because of .NET 4.7.2 upgrade.”
Azure Data Lake Storage (ADLS) Gen2 can now publish events to Azure Event Grid to be processed by subscribers such as WebHooks, Azure Event Hubs, Azure Functions, and Logic Apps. With this capability, individual changes to files and directories in ADLS Gen2 can automatically be captured and made available to data engineers for creating rich big data analytics platforms that use event-driven architectures.
Azure Event Grid provides reliable event delivery at massive scale. Event Grid integration brings change notifications for Azure Data Lake Storage Gen2. Individual changes to files and folders are automatically captured and made available to data engineers for the creation of Big Data Analytics platforms that can use Lambda architectures.
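As a rough illustration of the subscriber side of such an event-driven design, the sketch below routes a batch of storage events to per-type handlers. It is a minimal example under stated assumptions, not the official client: the field names (eventType, subject, data.url) follow the Event Grid event schema as commonly documented, while the account name, paths, handler names, and the route_events helper are hypothetical.

```python
import json

# Illustrative batch of events. Every value below is made up; only the field
# names are meant to mirror the Event Grid event schema (an assumption here).
SAMPLE_BODY = json.dumps([
    {
        "id": "1",
        "eventType": "Microsoft.Storage.BlobCreated",
        "subject": "/blobServices/default/containers/raw/blobs/2018/09/data.parquet",
        "eventTime": "2018-09-24T12:00:00Z",
        "data": {"url": "https://exampleaccount.dfs.core.windows.net/raw/2018/09/data.parquet"},
    },
    {
        "id": "2",
        "eventType": "Microsoft.Storage.BlobDeleted",
        "subject": "/blobServices/default/containers/raw/blobs/2018/08/old.parquet",
        "eventTime": "2018-09-24T12:00:05Z",
        "data": {"url": "https://exampleaccount.dfs.core.windows.net/raw/2018/08/old.parquet"},
    },
])


def on_created(event):
    """Hypothetical handler: schedule ingestion of a newly created file."""
    return "ingest " + event["data"]["url"]


# Map event types to handlers; types without an entry are skipped.
HANDLERS = {"Microsoft.Storage.BlobCreated": on_created}


def route_events(body, handlers):
    """Parse a JSON array of events and dispatch each one by its eventType."""
    results = []
    for event in json.loads(body):
        handler = handlers.get(event.get("eventType"))
        if handler is not None:
            results.append(handler(event))
    return results


if __name__ == "__main__":
    for action in route_events(SAMPLE_BODY, HANDLERS):
        print(action)
```

In a real deployment the request body would arrive through a WebHook, Azure Function, or similar subscriber rather than a hard-coded string, and the handler would start an actual processing step.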
If you use Azure Stream Analytics, you can now run your real-time pipelines with MSI-based authentication while writing to Azure Data Lake Storage Gen 1. This makes credential management much simpler and helps users provision long-running jobs, because they no longer need to deal with periodic password expiry.
Users of Azure Stream Analytics can now run their real-time pipelines with authentication based on Managed Service Identity while writing to Azure Data Lake Storage Gen 1. This will help users provision long-running jobs, because they no longer need to deal with password expiration.
Today we are sharing an update to the Azure HDInsight integration with Azure Data Lake Storage Gen 2. This integration will enable HDInsight customers to drive analytics from the data stored in Azure Data Lake Storage Gen 2 using popular open source frameworks such as Apache Spark, Hive, MapReduce, Kafka, Storm, and HBase in a secure manner.
This week at Microsoft Ignite 2018, we are excited to announce eight new features in Azure Stream Analytics (ASA). These new features include:
Support for query extensibility with C# custom code in ASA jobs running on Azure IoT Edge.
Custom de-serializers in ASA jobs running on Azure IoT Edge.
Live data testing in Visual Studio.
High throughput output to SQL.
ML-based anomaly detection on IoT Edge.
Authentication based on Managed Identities for Azure Resources (formerly MSI) for egress to Azure Data Lake Storage Gen 1.
Blob output partitioning by custom date/time formats.
User-defined custom re-partition count.