Tuesday, June 23, 2015

Pattern : Design for Failure

There are two basic categories of failures, which you’ll want to handle differently:
  • Transient, self-healing failures such as intermittent network connectivity issues.
  • Enduring failures that require intervention.
For transient failures, you can implement a retry policy to ensure that most of the time the app recovers quickly and automatically. Your customers might notice slightly longer response time, but otherwise they won’t be affected.
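A retry policy of this kind can be sketched in a few lines. The example below is only an illustration, not the approach used by any particular Azure SDK: `TransientError`, `retry`, and `make_flaky` are hypothetical names, and the attempt count and delays are arbitrary assumptions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are expected to heal on their own."""


def retry(operation, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller treat it as enduring
            # Double the wait after each failure and add a little jitter so
            # many clients do not retry at exactly the same moment.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))


def make_flaky(failures_before_success):
    """Build a fake operation that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientError("simulated network blip")
        return "ok after %d calls" % state["calls"]

    return operation
```

Calling `retry(make_flaky(2))` succeeds on the third call after two short waits. If every attempt fails, the last exception propagates, which is the point at which logging and monitoring for enduring failures should take over.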

For enduring failures, you can implement monitoring and logging functionality that notifies you promptly when issues arise and that facilitates root-cause analysis.

Machine failures are fairly common in the cloud world. How do we get machine resiliency?







FAILURE SCOPE

You also have to think about failure scope—whether a single machine is affected, a whole service such as SQL Database or Storage, or an entire region.
 

Machine failures

In Azure, a failed server is automatically replaced by a new one, and a well-designed cloud app recovers from this kind of failure automatically and quickly. Earlier, we stressed the scalability benefits of a stateless web tier, and ease of recovery from a failed server is another benefit of statelessness. Ease of recovery is also one of the benefits of platform-as-a-service (PaaS) features such as SQL Database and Web Apps. Hardware failures are rare, but when they occur, these services handle them automatically; you don’t even have to write code to handle machine failures when you’re using one of these services.

Service failures

Cloud apps typically use multiple services. For example, the Fix It app uses the SQL Database service and the Storage service, and it’s deployed to the Web Apps service. What will your app do if one of the services you depend on fails? For some service failures a friendly “Sorry, try again later” message might be the best you can do. But in many scenarios you can do better. For example, when your back-end data store is down, you can accept user input, display “Your request has been received,” and store the input someplace else temporarily. Then, when the service you need is operational again, you can retrieve the input and process it.
Chapter 13, “Queue-centric work pattern,” shows one way to handle this scenario. The Fix It app stores tasks in SQL Database, but it doesn’t have to quit working when SQL Database is down. In that chapter you'll see how to store user input for a task in a queue and use a worker process to read the queue and update the task. If SQL Database is down, the ability to create Fix It tasks is unaffected; the worker process can wait and process new tasks when SQL Database is available.
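The idea of accepting input while the data store is down can be sketched as follows. This is a simplified, in-memory illustration and not the Fix It app's code: `FakeTaskStore`, `DatabaseDown`, `submit_task`, and `drain` are hypothetical names, and a real app would use a durable queue service rather than a `deque` in process memory.

```python
from collections import deque


class DatabaseDown(Exception):
    """Hypothetical error raised while the data store is unavailable."""


class FakeTaskStore:
    """Stand-in for the database; the 'available' flag simulates an outage."""

    def __init__(self):
        self.available = True
        self.tasks = []

    def insert(self, task):
        if not self.available:
            raise DatabaseDown()
        self.tasks.append(task)


def submit_task(pending, task):
    """Front end: enqueue the input and acknowledge without touching the store."""
    pending.append(task)
    return "Your request has been received"


def drain(pending, store):
    """Worker: move queued tasks into the store, stopping if the store is down."""
    processed = 0
    while pending:
        try:
            store.insert(pending[0])  # peek first so a failure loses nothing
        except DatabaseDown:
            break  # leave remaining tasks queued for the next run
        pending.popleft()
        processed += 1
    return processed
```

While `store.available` is `False`, `submit_task` still succeeds and `drain` returns 0 with the tasks left in the queue. Once the store is available again, the next `drain` call writes them in their original order.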

Region failures

Entire regions may fail. A natural disaster might destroy a data center—it might be flattened by a meteor, the trunk line into the datacenter could be cut by a farmer burying a cow with a backhoe, etc. If your app is hosted in the stricken data center, what do you do? It’s possible to set up your app in Azure to run in multiple regions simultaneously so that if a disaster occurs in one, your app continues running in another region. Such failures are extremely rare occurrences, and most apps don’t jump through the hoops necessary to ensure uninterrupted service through failures of this sort. See the Resources section at the end of the chapter for information about how to keep your app available even through a region failure.
A goal of Azure is to make handling these kinds of failures a lot easier, and you’ll see some examples of how Azure does that in the following chapters.

Sunday, May 24, 2015

Blob Storage


CHOOSING A DATA STORAGE OPTION


No one approach is right for all scenarios. If anyone says that a particular technology is the answer, the first thing to ask is "What is the question?" because different solutions are optimized for different things. The relational model has definite advantages; that’s why it’s been around for so long. But there are also downsides to SQL that can be addressed with a NoSQL solution.
Often, what we see work best is a composite approach in which SQL and NoSQL are used in a single solution. Even when people say they’re embracing NoSQL, a closer look reveals that they’re using several different NoSQL frameworks: CouchDB, Redis, and Riak, each for different things. Even Facebook, which uses NoSQL solutions extensively, uses different NoSQL frameworks for different parts of the service. The flexibility to mix and match data storage approaches is one of the qualities that’s nice about the cloud; it’s easy to use multiple data solutions and integrate them in a single app.
Here are some questions to think about when you’re choosing an approach:
Data semantic: What is the core data storage and data access semantic (are you storing relational or unstructured data)? Unstructured data such as media files fits best in Blob storage; a collection of related data such as products, inventories, suppliers, customer orders, etc., fits best in a relational database.
Query support: How easy is it to query the data? What types of questions can be efficiently asked? Key/value data stores are very good at getting a single row when given a key value, but they are not so good for complex queries. For a user-profile data store in which you are always getting the data for one particular user, a key/value data store could work well. For a product catalog from which you want to get different groupings based on various product attributes, a relational database might work better. NoSQL databases can store large volumes of data efficiently, but you have to structure the database around how the app queries the data, and this makes ad hoc queries harder to do. With a relational database, you can build almost any kind of query.
Functional projection: Can questions, aggregations, and so on be executed on the server? If you run SELECT COUNT(*) from a table in SQL, the DBMS will very efficiently do all the work on the server and return the number you’re looking for. If you want the same calculation from a NoSQL data store that doesn't support aggregation, this operation is an inefficient “unbounded query” and will probably time out. Even if the query succeeds, you have to retrieve all the data from the server, bring it to the client, and count the rows on the client. What languages or types of expressions can be used? With a relational database, you can use SQL. With some NoSQL databases, such as Azure Table storage, you’ll be using [OData](http://www.odata.org/), and all you can do is filter on the primary key and get projections (select a subset of the available fields).
Ease of scalability: How often and how much will the data need to scale? Does the platform natively implement scale-out? How easy is it to add or remove capacity (size and throughput)? Relational databases and tables aren’t automatically partitioned to make them scalable, so they are difficult to scale beyond certain limitations. NoSQL data stores such as Azure Table storage inherently partition everything, and there is almost no limit to adding partitions. You can readily scale Table storage up to 200 terabytes, but the maximum database size for Azure SQL Database is 500 gigabytes. You can scale relational data by partitioning it into multiple databases, but setting up an application to support that model involves a lot of programming work.
Instrumentation and manageability: How easy is the platform to instrument, monitor, and manage? You need to remain informed about the health and performance of your data store, so you need to know up front what metrics a platform gives you for free and what you have to develop yourself.
Operations: How easy is the platform to deploy and run on Azure? PaaS? IaaS? Linux? Azure Table storage and Azure SQL Database are easy to set up on Azure. Platforms that aren’t built-in Azure PaaS solutions require more effort.
API support: Is an API available that makes it easy to work with the platform? The Azure Table Service has an SDK with a .NET API that supports the .NET 4.5 asynchronous programming model. If you're writing a .NET app, the work to write and test the code will be much easier for the Azure Table Service than for a key/value column data store platform that has no API or a less comprehensive one.
Transactional integrity and data consistency: Is it critical that the platform support transactions to guarantee data consistency? For keeping track of bulk emails sent, performance and low data-storage cost might be more important than automatic support for transactions or referential integrity in the data platform, making the Azure Table Service a good choice. For tracking bank account balances or purchase orders, a relational database platform that provides strong transactional guarantees would be a better choice.
Business continuity: How easy are backup, restore, and disaster recovery? Sooner or later production data will become corrupted and you’ll need an undo function. Relational databases often have more fine-grained restore capabilities, such as the ability to restore to a point in time. Understanding which restore features are available in each platform you’re considering is an important part of the decision.
Cost: If more than one platform can support your data workload, how do they compare in cost? For example, if you use ASP.NET Identity, you can store user profile data in Azure Table Service or Azure SQL Database. If you don't need the rich querying facilities of SQL Database, you might choose Azure Table storage in part because it costs much less for a given amount of storage.
Microsoft generally recommends that you know the answers to the questions in each of these categories before you choose your data storage solutions.
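To make the functional projection question concrete, the sketch below contrasts letting the database engine compute a count with pulling every row back and counting in application code. It uses Python's built-in `sqlite3` module purely as a stand-in for a relational DBMS; SQLite runs in process, so "server" is used loosely here, and the `orders` table and function names are hypothetical.

```python
import sqlite3


def make_orders_db():
    """Create a small in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.executemany(
        "INSERT INTO orders (customer) VALUES (?)",
        [("alice",), ("bob",), ("alice",)],
    )
    return conn


def count_on_server(conn):
    """The engine aggregates; only a single number comes back."""
    (count,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
    return count


def count_on_client(conn):
    """Every row is transferred, then counted in application code."""
    return sum(1 for _ in conn.execute("SELECT id, customer FROM orders"))
```

Both functions return the same number, but the second one moves the whole table to the caller first. With three rows that cost is invisible; with a large table in a remote data store, it is the difference between a quick answer and a query that may time out.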
In addition, your workload might have specific requirements that some platforms can support better than others. For example:
  • Does your application require audit capabilities?
  • What are your data longevity requirements—do you require automated archival or purging capabilities?
  • Do you have specialized security needs? For example, your data might include personally identifiable information (PII), but you have to be sure that PII is excluded from query results.
  • If you have some data that can't be stored in the cloud for regulatory or technological reasons, you might need a cloud data storage platform that facilitates integration with your on-premises storage.

Saturday, May 23, 2015

Pattern : Data Storage


Azure Storage Resources








DATA STORAGE OPTIONS ON AZURE

The cloud makes it relatively easy to use a variety of relational and NoSQL data stores. Here are some of the data storage platforms that you can use in Azure.
The illustration shows four types of NoSQL databases:
  • Key/value databases store a single serialized object for each key value. They’re good for storing large volumes of data in situations where you want to get one item for a given key value and you don’t have to query based on other properties of the item.
      • Azure Blob storage is a key/value database that functions like file storage in the cloud, with key values that correspond to folder and file names. You retrieve a file by its folder and file name, not by searching for values in the file contents.
      • Azure Table storage is also a key/value database. Each value is called an entity (similar to a row, identified by a partition key and row key) and contains multiple properties (similar to columns, but not all entities in a table have to share the same columns). Querying on columns other than the key is extremely inefficient and should be avoided. For example, you can store user profile data, with one partition storing information about a single user. You could store data such as user name, password hash, birth date, and so forth, in separate properties of one entity or in separate entities in the same partition. But you wouldn't want to query for all users with a given range of birth dates, and you can't execute a join query between your profile table and another table. Table storage is more scalable and less expensive than a relational database, but it doesn't enable complex queries or joins.
  • Document databases are key/value databases in which the values are documents. "Document" here isn't used in the sense of a Word or an Excel document but means a collection of named fields and values, any of which could be a child document. For example, in an order history table, an order document might have order number, order date, and customer fields, and the customer field might have name and address fields. The database encodes field data in a format such as XML, YAML, JSON, or BSON, or it can use plain text. One feature that sets document databases apart from other key/value databases is the capability they provide to query on nonkey fields and define secondary indexes, which makes querying more efficient. This capability makes a document database more suitable for applications that need to retrieve data on the basis of criteria more complex than the value of the document key. For example, in a sales order history document database, you could query on various fields, such as product ID, customer ID, customer name, and so forth.
      • Azure DocumentDB is a NoSQL document database service designed for modern mobile and web applications. DocumentDB delivers consistently fast reads and writes, schema flexibility, and the ability to easily scale a database up and down on demand. DocumentDB enables complex ad hoc queries using a SQL query language, supports well-defined consistency levels, and offers JavaScript language-integrated, multi-document transaction processing using the familiar programming model of stored procedures, triggers, and UDFs.
  • Column-family databases are key/value data stores that enable you to structure data storage into collections of related columns called column families. For example, a census database might have one group of columns for a person's name (first, middle, last), one group for the person's address, and one group for the person's profile information (date of birth, gender, and so on). The database can then store each column family in a separate partition while keeping all of the data for one person related to the same key. You can then read all profile information without having to read through all of the name and address information as well. Cassandra is a popular column-family database.
  • Graph databases store information as a collection of objects and relationships. The purpose of a graph database is to enable an application to efficiently perform queries that traverse the network of objects and the relationships between them. For example, the objects might be employees in a human resources database, and you might want to facilitate queries such as "find all employees who directly or indirectly work for Scott." Neo4j is a popular graph database.
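The graph query mentioned in the last item, "find all employees who directly or indirectly work for Scott," is a traversal over relationships. The sketch below shows the idea with a plain dictionary and a breadth-first search; it is not how Neo4j or any other graph database is queried, and the org chart in `REPORTS` is made-up sample data.

```python
from collections import deque

# Hypothetical org chart: each manager maps to a list of direct reports.
REPORTS = {
    "Scott": ["Ana", "Raj"],
    "Ana": ["Mei"],
    "Raj": [],
    "Mei": ["Tom"],
    "Pat": ["Lee"],
}


def all_reports(manager, reports=REPORTS):
    """Return everyone who works for manager, directly or indirectly."""
    found = []
    seen = {manager}
    frontier = deque([manager])
    while frontier:
        current = frontier.popleft()
        for person in reports.get(current, []):
            if person not in seen:  # guard against cycles in bad data
                seen.add(person)
                found.append(person)
                frontier.append(person)
    return found
```

Here `all_reports("Scott")` walks from Scott to Ana and Raj, then on to Mei and Tom, and never visits Pat's branch. A graph database is designed to run this kind of multi-hop traversal efficiently, whereas a relational database would typically need a recursive query or repeated self-joins.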
Compared with relational databases, the NoSQL options offer far greater scalability and are more cost effective for storage and analysis of unstructured data. The tradeoff is that they don't provide the rich querying and robust data integrity capabilities of relational databases. NoSQL options would work well for IIS log data, which involves high volume with no need for join queries. NoSQL options would not work so well for banking transactions, which require absolute data integrity and involve many relationships to other account-related data.
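The querying tradeoff can be illustrated with a small in-memory model of a key/value table. This is only a sketch loosely modeled on the partition key and row key idea described for Table storage; it does not use the Azure SDK, and `table`, `get_entity`, and `born_between` are hypothetical names with made-up data.

```python
from datetime import date

# Hypothetical entities keyed by (partition key, row key).
table = {
    ("user-001", "profile"): {"name": "Ana", "birth_date": date(1985, 3, 14)},
    ("user-002", "profile"): {"name": "Raj", "birth_date": date(1992, 7, 1)},
    ("user-003", "profile"): {"name": "Mei", "birth_date": date(1988, 11, 30)},
}


def get_entity(partition_key, row_key):
    """Point lookup by key: a single access, regardless of table size."""
    return table.get((partition_key, row_key))


def born_between(start, end):
    """Query on a non-key property: every entity has to be examined."""
    return sorted(
        entity["name"]
        for entity in table.values()
        if start <= entity["birth_date"] <= end
    )
```

`get_entity("user-002", "profile")` goes straight to one entity, while `born_between` has to scan the whole table. That scan is the kind of query the text above says to avoid in a key/value store and to hand to a relational database instead.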
A newer category of database platforms, called NewSQL, combines the scalability of a NoSQL database with the querying capability and transactional integrity of a relational database.
NewSQL databases are designed for distributed storage and query processing, which are often hard to implement in "OldSQL" databases. NuoDB is an example of a NewSQL database that can be used on Azure.

Sunday, May 17, 2015

Azure AD Integration

What's great for enterprise single sign-on, though, is the Directory Integration tab:
If you enable directory integration and download a tool, you can sync this cloud directory with your existing on-premises Active Directory that you're already using inside your organization. Then, all of the users stored in your directory will show up in this cloud directory. Your cloud apps can now authenticate all of your employees using their existing Active Directory credentials. And all this is free -- both the sync tool and Azure AD itself.
The tool is a wizard that is easy to use, as you can see from the following screen shots. These are not complete instructions, just an example showing you the basic process. For more detailed how-to-do-it information, see the links in the Resources section at the end of the chapter.
First you see the Welcome page.
Click Next, and then enter your Azure Active Directory credentials.
Click Next, and then enter your on-premises AD credentials.
Click Next, and then indicate whether you want to store a hash of your AD passwords in the cloud.
The password hash that you can store in the cloud is a one-way hash; actual passwords are never stored in Azure AD. If you decide against storing hashes in the cloud, you'll have to use Active Directory Federation Services (ADFS). There are also other factors to consider when choosing whether to use ADFS. The ADFS option requires a few additional configuration steps.
If you choose to store hashes in the cloud, you're done, and the tool starts synchronizing directories when you click Next.
And in a few minutes you're done.
You only have to run this wizard on one domain controller in the organization; the server must be running Windows Server 2003 or higher. And no need to reboot. When you're done, all of your users are set up in the cloud, and you can do single sign-on from any web or mobile application, using SAML, OAuth, or WS-Fed.

Thursday, May 14, 2015

Availability Sets

There are two kinds of maintenance events:

1. Planned maintenance events
2. Unplanned maintenance events

There are instances that require a VM reboot.

You can add one or more VMs to an availability set.

Each VM in an availability set is assigned the following:

a. Fault Domain
b. Update Domain

Virtual Network (vNet)

1. Creates a Virtual Network so Virtual Machines can communicate safely

2. Provides a layer of security by providing a layer of isolation

3. Can connect on-premises computers to the Virtual Network in Azure via an Azure Virtual Network Gateway.

  • Site-to-Site - connecting on-premises networks with Virtual Networks
  • “Hybrid Cloud”
  • Point-to-Site - connecting a single computer to a Virtual Network

4. When to use Virtual Network?
  • Create a dedicated private cloud-only network for VMs already in Azure
  • Securely extend your data center to the cloud
  • Enable hybrid cloud scenarios to connect cloud-based applications to on-premises services (db, web services, etc.)