SUCCESS STORIES

Take a peek at the projects we've worked on so far. Here you can find detailed information about several projects, describing the objectives, the challenges, and the results.

A European government centralizes data with the help of Azure, delivering integrated services to its citizens through a newly developed platform.

The government found a solution that centralizes data from all public entities across the country into a single platform, gathering and transforming data through various processes and ensuring nationwide data synchronization.

Business needs

Developing a new multi-platform to integrate the needs of every institution across the country proved essential to the government's digital transformation effort.

Because most institutions operated in isolation, each with its own applications and databases, communication and integration between their services seemed nearly impossible. Citizens had to register with each service separately, maintaining different credentials for each, and their data was stored independently by every service.

The multi-platform, built to serve all citizens' needs in one integrated system, required a centralized data solution from which it could fetch relevant information.

Because the independent databases had different structures and schemas, a solution was needed to gather the information from all of them and restructure it, so that the new centralized data lake would contain relevant, valuable information for the multi-platform.

Solutions implemented

  1) Azure Data Factory

In the world of big data, raw, unorganized data is often stored in relational, non-relational, and other storage systems. On its own, however, raw data lacks the context and meaning to provide actionable insights to analysts, data scientists, or business decision makers.

The data was structured in many different ways. The project needed a solution that could extract and integrate it centrally, and that could orchestrate and operationalize the processes that refine these enormous stores of raw data into actionable business insights.

Azure Data Factory provided the data integration and transformation layer for the organization's digital transformation initiatives, executing complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.

With the help of Data Factory, the following tasks were achieved:
• Enabling data engineers to drive business-led and IT-led analytics/BI.
• Preparing data, constructing ETL and ELT processes, and orchestrating and monitoring pipelines code-free (a programmatic sketch follows this list).
• Transforming data faster with intelligent, intent-driven mapping that automates copy activities.
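
As an illustration, such a pipeline can be defined programmatically with the azure-mgmt-datafactory Python SDK. This is a minimal sketch, not the project's actual code; the subscription, resource group, factory, and dataset names are hypothetical placeholders:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        PipelineResource, CopyActivity, DatasetReference, BlobSource, BlobSink)

    # Hypothetical identifiers -- substitute real ones.
    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # One copy activity: move an institution's exported data into the
    # central lake. The referenced datasets are assumed to exist already.
    copy_activity = CopyActivity(
        name="CopyInstitutionData",
        inputs=[DatasetReference(type="DatasetReference",
                                 reference_name="InstitutionSourceDataset")],
        outputs=[DatasetReference(type="DatasetReference",
                                  reference_name="CentralLakeDataset")],
        source=BlobSource(),
        sink=BlobSink())

    # Register the pipeline with the factory.
    adf.pipelines.create_or_update(
        "gov-data-rg", "gov-data-factory", "IngestInstitutionData",
        PipelineResource(activities=[copy_activity]))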

Ingesting data from diverse, multiple sources can be expensive and time-consuming, and can require multiple solutions. The built-in connectors that Azure Data Factory offers proved essential: connectors for enterprise data warehouses like Oracle Exadata and Teradata, and for SaaS apps like Salesforce and ServiceNow, were a huge help in speeding up the process and keeping it error-free. Registering one such connection as a linked service is sketched below.
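
A sketch of how a connection is registered; all names and the connection string are placeholders, and a simple Azure Storage linked service stands in for the enterprise connectors named above:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        LinkedServiceResource, AzureStorageLinkedService, SecureString)

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Hypothetical: register a storage account as a linked service so that
    # datasets and pipelines can reference it as a source or sink.
    storage_ls = LinkedServiceResource(
        properties=AzureStorageLinkedService(
            connection_string=SecureString(
                value="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>")))

    adf.linked_services.create_or_update(
        "gov-data-rg", "gov-data-factory", "CentralStorage", storage_ls)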

The service proved essential, as its ETL capabilities covered the following (a run-and-monitor sketch follows the list):
• Data ingestion
• Control flow
• Data flow
• Pipeline scheduling
• Pipeline monitoring
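
In SDK terms, scheduling and monitoring reduce to triggering a run and polling its status. A minimal sketch, reusing the hypothetical names from the pipeline sketch above:

    import time
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Trigger an on-demand run of the ingestion pipeline.
    run = adf.pipelines.create_run(
        "gov-data-rg", "gov-data-factory", "IngestInstitutionData", parameters={})

    # Poll until the run completes. Production setups would rely on
    # schedule or tumbling-window triggers and alerts rather than a loop.
    while True:
        status = adf.pipeline_runs.get("gov-data-rg", "gov-data-factory", run.run_id)
        if status.status not in ("Queued", "InProgress"):
            break
        time.sleep(30)
    print("Pipeline finished with status:", status.status)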


  2) Azure Data Lake Storage Gen2

A solution that could easily manage massive amounts of data and also offer a hierarchical namespace was needed. Azure Data Lake Storage Gen2 proved to be that solution, as it builds on Blob Storage and enhances performance, management, and security in the following ways:
• Performance is optimized because you do not need to copy or transform data as a prerequisite for analysis. Compared to the flat namespace on Blob storage, the hierarchical namespace greatly improves the performance of directory management operations, which improves overall job performance. 
• Management is easier because you can organize and manipulate files through directories and subdirectories. 
• Security is enforceable because you can define POSIX permissions on directories or individual files (see the sketch after this list).
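
For example, POSIX-style permissions can be applied to a directory with the azure-storage-file-datalake Python SDK. A minimal sketch; the account, file system, and directory names are hypothetical:

    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url="https://<account>.dfs.core.windows.net",
        credential=DefaultAzureCredential())
    fs = service.get_file_system_client(file_system="central-lake")

    # Create a per-institution directory and restrict it with POSIX-style
    # permissions: owner rwx, group r-x, others no access.
    directory = fs.get_directory_client("institutions/tax-authority")
    directory.create_directory()
    directory.set_access_control(permissions="rwxr-x---")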

Scalability was also a key requirement, as the volume of data to be managed was huge. Data Lake Storage Gen2 can store and serve many exabytes of data, with throughput measured in gigabits per second (Gbps) at high levels of input/output operations per second (IOPS). Beyond persistence, processing executes at near-constant per-request latencies measured at the service, account, and file levels.

One of the many benefits of storing raw data in Data Lake Storage Gen2 is the low cost of storage capacity and transactions. Unlike other cloud storage services, data stored in Data Lake Storage Gen2 is not required to be moved or transformed prior to performing analysis. 

The key features of Data Lake Storage Gen2 that proved essential are:
• Hadoop-compatible access: Data Lake Storage Gen2 allows you to manage and access data just as you would with a Hadoop Distributed File System (HDFS); see the PySpark sketch after this list.
• A superset of POSIX permissions: the security model for Data Lake Storage Gen2 supports ACLs and POSIX permissions, along with extra granularity specific to Data Lake Storage Gen2.
• Cost effective: Data Lake Storage Gen2 offers low-cost storage capacity and transactions.
• Optimized driver: the ABFS driver is optimized specifically for big data analytics.
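
Hadoop-compatible access means analytics engines address the lake through abfss:// URIs just as they would an HDFS path. A minimal PySpark sketch; the container, account, and path are hypothetical, and authentication (service principal or managed identity) is assumed to be configured:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("central-lake-read").getOrCreate()

    # The ABFS driver lets Spark read Data Lake Storage Gen2 paths directly,
    # exactly as it would read from HDFS.
    citizens = spark.read.parquet(
        "abfss://central-lake@<account>.dfs.core.windows.net/institutions/")
    citizens.printSchema()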

Benefits

With the data now centralized and organized, the multi-platform can be developed, integrating the needs of every institution across the country and enabling truly centralized services.

Institutions can now pull data from the centralized data lake and obtain valuable, real-time information about each registered citizen, delivering public services quickly and securely.

For the everyday user, the citizen, the new platform means a single account for all institutions, making every interaction with public services seamless. Through the multi-platform, citizens can pay taxes, request official documents, and perform other actions in just a few clicks.

Azure Data Factory proved to be a simple, fast, and secure way to migrate, structure, and manipulate data: the organization could decide how each process would work, which data would be selected and stored in the new centralized platform, and how to derive new, valuable insights from it. With the help of Azure Data Lake Storage Gen2, storing these massive amounts of data was seamless, with the added benefit of a hierarchical namespace.

adaptive.run

Transform your business.
Run adaptive.

Contact

Phone: +40 72 444 3842
Email: hello@adaptive.run

© Copyright 2019-2024 adaptive.run - All Rights Reserved