Principal Architect ( Data & AI) Over 22 yrs. of experience in IT. Global Delivery Models.
Friday, August 26, 2022
Thursday, August 25, 2022
Snowflake - Architecture
Snowflake
Snowflake is an analytic data warehouse provided as Software-as-a-Service (SaaS). There is no hardware (virtual or physical) to select, install, or configure, and no software to install. All ongoing maintenance and tuning is handled by Snowflake.
Database Storage
When data is loaded into Snowflake, it is reorganized into multiple micro-partitions in Snowflake's internal optimized, compressed, columnar format and stored in cloud storage. The storage layer behaves like a shared-disk model: every compute cluster can reach the same data, which keeps data management simple. Users therefore do not have to decide how data is distributed across nodes, as they would in a shared-nothing model. Snowflake manages all aspects of how this data is stored: the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage.
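As a rough illustration of why columnar micro-partitions with per-partition metadata help, the sketch below splits rows into small partitions, stores each one column by column, and skips any partition whose min/max range cannot contain the value being searched for. Snowflake's real storage format is internal, so the `MicroPartition` class, the function names, and the tiny partition size are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """Hypothetical stand-in for a micro-partition (not Snowflake's real format)."""
    columns: dict  # column name -> list of values (columnar layout)

    def min_max(self, column):
        values = self.columns[column]
        return min(values), max(values)


def build_partitions(rows, partition_size):
    """Split rows into fixed-size partitions and store each one column by column."""
    partitions = []
    for start in range(0, len(rows), partition_size):
        chunk = rows[start:start + partition_size]
        columns = {name: [row[name] for row in chunk] for name in chunk[0]}
        partitions.append(MicroPartition(columns))
    return partitions


def scan_equal(partitions, column, target):
    """Return matching rows and the number of partitions actually scanned."""
    matches, scanned = [], 0
    for part in partitions:
        low, high = part.min_max(column)
        if not (low <= target <= high):
            continue  # pruned: the metadata shows the value cannot be here
        scanned += 1
        names = list(part.columns)
        for i, value in enumerate(part.columns[column]):
            if value == target:
                matches.append({name: part.columns[name][i] for name in names})
    return matches, scanned


# Twelve illustrative rows, four rows per partition (three partitions).
rows = [{"order_id": i, "amount": i * 10} for i in range(1, 13)]
partitions = build_partitions(rows, partition_size=4)
matches, scanned = scan_equal(partitions, "order_id", 6)
print(matches, scanned)
```

Only the partition whose `order_id` range covers the target is read; the other two are skipped using metadata alone.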
Query Processing
Query execution is performed in the processing (compute) layer, which Snowflake keeps separate from the storage layer. Snowflake processes queries using “virtual warehouses”. Each virtual warehouse is a Massively Parallel Processing (MPP) compute cluster composed of multiple compute nodes allocated by Snowflake from a cloud provider. Each virtual warehouse is an independent compute cluster that does not share compute resources with other virtual warehouses. As a result, each virtual warehouse has no impact on the performance of other virtual warehouses.
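The MPP idea can be sketched in a few lines: the work is split across several workers, each worker aggregates its own slice, and the partial results are combined. This is a toy model, not how Snowflake schedules queries; `warehouse_sum`, `partial_sum`, and the use of threads as stand-ins for compute nodes are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one hypothetical compute node: aggregate its own slice."""
    return sum(chunk)


def warehouse_sum(values, node_count):
    """Split values across node_count workers and combine their partial sums."""
    # Round-robin split so every worker gets a similar share of the data.
    chunks = [values[i::node_count] for i in range(node_count)]
    # Each call builds its own pool, so two "warehouses" never share workers.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(partial_sum, chunks))
    return sum(partials)


values = list(range(1, 101))
total = warehouse_sum(values, node_count=4)
print(total)
```

Because every call creates a separate worker pool, one simulated warehouse cannot take workers away from another, which loosely mirrors the isolation described above.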
Cloud Services
The cloud services layer is a collection of services that coordinate activities across Snowflake. These services tie together all of the different components of Snowflake in order to process user requests, from login to query dispatch. The cloud services layer also runs on compute instances provisioned by Snowflake from the cloud provider.
Among the services in this layer:
Authentication
Infrastructure management
Metadata management
Query parsing and optimization
Access control
Row-based vs Columnar-based storage organization
Snowflake Architecture and Components
Types of Data
Warehouse Architecture
• Single-Tier Architecture: This type aims to minimize the amount of stored data by removing redundancy.
• Two-Tier Architecture: This type of architecture separates the actual data sources from the data warehouse. This enables the data warehouse to expand and support multiple end users.
• Three-Tier Architecture: This type of architecture has 3 tiers. The bottom tier contains the data warehouse server databases, the middle tier is an Online Analytical Processing (OLAP) server that provides an abstracted view of the database, and the top tier is the front-end client layer, which includes the tools and APIs used to extract data.
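The three tiers listed above can be mimicked in miniature. In the sketch below an in-memory SQLite database plays the bottom tier, a rollup function plays the OLAP middle tier, and a small formatting function plays the client tier. The table, the sample rows, and all function names are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def build_bottom_tier():
    """Bottom tier: the warehouse database (an in-memory SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100), ("west", 250), ("east", 50)],
    )
    return conn


def sales_by_region(conn):
    """Middle tier: an OLAP-style rollup that hides the raw table from clients."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


def render_report(summary):
    """Top tier: a client tool that only sees the abstracted view."""
    return [f"{region}: {total}" for region, total in summary.items()]


conn = build_bottom_tier()
report = render_report(sales_by_region(conn))
print(report)
```

The client code never touches the `sales` table directly; it only consumes the summary that the middle tier exposes.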
The 4 components of the Data Warehouse
1. Data Warehouse Database
The database forms an integral part of the data warehouse. It stores and provides access to corporate data. Amazon Redshift and Azure SQL come under cloud-based database services.
2. Extraction, Transformation, and Loading (ETL) Tools
All activities associated with the extraction, transformation, and loading (ETL) of data into a warehouse fall under this category.
3. Metadata
Metadata provides the framework and definitions of data, allowing for the creation, storage, management, and use of data.
4. Data Warehouse Access Tools
These tools include data reporting tools, query tools, application development tools, data mining tools, and OLAP tools.
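To make the ETL component (item 2 above) more concrete, here is a minimal sketch of the three steps using only the Python standard library. The CSV text, the `payments` table, and the cleaning rules are invented for the example, and an in-memory SQLite database stands in for the warehouse; a real pipeline would target the warehouse's own loader or connector.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; one row is deliberately malformed.
RAW_CSV = """customer,amount
alice, 120.50
bob,not_a_number
carol,80
"""


def extract(text):
    """Extract: read raw records from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: normalize names, parse amounts, drop rows that fail to parse."""
    cleaned = []
    for record in records:
        try:
            amount = float(record["amount"].strip())
        except ValueError:
            continue  # skip rows whose amount is not numeric
        cleaned.append((record["customer"].strip().title(), amount))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the target table in one transaction."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (customer TEXT, amount REAL)")
    with conn:
        conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT customer, amount FROM payments ORDER BY customer"
).fetchall()
print(loaded)
```

Keeping the three steps as separate functions makes it easy to swap the source or the target without touching the cleaning logic.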
Snowflake Features
Features of Snowflake Data Warehouse
1. Security and Data Protection: Snowflake provides advanced authentication through Multi-Factor Authentication (MFA), federated authentication with Single Sign-On (SSO), and OAuth. All communication between client and server is secured by TLS.
2. Standard and Extended SQL Support: Snowflake supports most SQL DDL and DML commands. It also supports advanced DML, transactions, lateral views, stored procedures, etc.
3. Connectivity: Snowflake supports a comprehensive set of client connectors and drivers, such as the Python connector, Spark connector, Node.js driver, .NET driver, etc.
4. Data Sharing: You can securely share data with other Snowflake accounts.
1. Simple: Hevo has a simple and intuitive user interface.
2. Fault-Tolerant: Hevo offers a fault-tolerant architecture. It can automatically detect anomalies and alert you immediately.
3. Real-Time: Hevo has a real-time streaming architecture, which ensures your data is always ready for analysis.
4. Schema Mapping: Hevo will automatically detect the schema from your incoming data and map it to your destination schema.
5. Data Transformation: Hevo provides a simple visual interface to refine, edit, and enrich the data you want to transfer.
6. Live Support: The Hevo team is available around the clock to provide dedicated support via chat, email, and phone.
Friday, August 19, 2022
Thursday, August 18, 2022
AWS Control Tower
Your landing zone is now available.
AWS Control Tower has set up the following:
- 2 organizational units, one for your shared accounts and one for accounts that will be provisioned by your users.
- 3 shared accounts, which are the management account and isolated accounts for log archive and security audit.
- A native cloud directory with preconfigured groups and single sign-on access.
- 20 preventive guardrails to enforce policies and 3 detective guardrails to detect configuration violations.