
Securitybricks Launches FedRAMP Accelerator on ServiceNow Platform

Securitybricks releases the first FedRAMP accelerator built on the ServiceNow platform for the recently released FedRAMP Rev 5 controls.

SAN FRANCISCO, CA, UNITED STATES, July 25, 2023/EINPresswire.com/ — Securitybricks announced today that it has launched the first FedRAMP accelerator built on the ServiceNow platform for the recently released FedRAMP Rev 5 controls. It is now available on the ServiceNow Store as a free download.

As an authorized FedRAMP Third-Party Assessment Organization (3PAO) and a ServiceNow Build partner, Securitybricks combined its in-depth understanding of control testing with continuous monitoring capabilities and control automation for 80% of the FedRAMP controls. The accelerator will enable Cloud Service Providers (CSPs) to reduce the time needed for a FedRAMP Authority to Operate (ATO) assessment by using various data elements within their ServiceNow platform.

The accelerator comes with the 320+ controls needed for a FedRAMP Moderate assessment and questionnaire samples built on ServiceNow CAM (Continuous Authorization and Monitoring). Out of the box, the accelerator includes FedRAMP Rev 5 control content along with the ability to build an authorization boundary, an SSP (system security plan), and POA&M management.

The Securitybricks FedRAMP solution extends the free accelerator and includes:

– Complete citations and authority documents for FedRAMP Rev 5 controls
– Content for inherited controls from Azure and AWS
– Ability to build authorization boundary using cloud workload data
– SSP document along with required FedRAMP ATO artifacts
– Connectors to AWS Security Hub and Azure Defender for cloud configuration and vulnerability data
– Supply chain controls automation
– Continuous monitoring reporting including POA&M

“We are excited to bring the first automated FedRAMP ATO solution built on the ServiceNow platform. In addition, all our solution implementations are backed by a free ‘mock 3PAO audit’ to guarantee the ATO package meets FedRAMP PMO requirements,” stated Raj Raghavan, CEO of Securitybricks.

##

About Securitybricks, Inc.

Securitybricks, Inc. is a cybersecurity consulting firm focused on cloud security and compliance. Based in the U.S., its team members are all U.S. citizens, including military veterans, with more than 15 years of experience in implementing cybersecurity and regulatory compliance controls. https://securitybricks.io/.

About FedRAMP

The Federal Risk and Authorization Management Program (FedRAMP) is a United States federal government-wide compliance program that provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services. All Cloud Service Providers (CSPs) seeking to offer services to the Federal government are required to be assessed by a 3PAO.

Learn more about how Securitybricks can help you achieve FedRAMP compliance at the FedRAMP Marketplace.

Raj Raghavan
Securitybricks, Inc.

https://www.einpresswire.com/article/646297318/securitybricks-launches-fedramp-accelerator-on-servicenow-platform

Securitybricks Earns FedRAMP Third Party Assessment Organization (3PAO) Status

This accreditation enables Securitybricks to validate that service providers have implemented the required cloud security measures to protect government data.

SAN FRANCISCO, CA, UNITED STATES, April 18, 2023/EINPresswire.com/ — Securitybricks announced today that it has earned accreditation as a Third Party Assessment Organization (3PAO) under the Federal Risk and Authorization Management Program (FedRAMP). This accreditation authorizes Securitybricks to assess and certify cybersecurity controls for Cloud Service Providers looking to do business with any of the 400+ U.S. federal agencies.

To earn the 3PAO accreditation, Securitybricks completed a comprehensive assessment conducted over a span of two years by the American Association for Laboratory Accreditation (A2LA), the FedRAMP 3PAO accreditation body, to verify technical competence and quality management compliance with the ISO/IEC 17020:2012 standard.

Securitybricks will leverage its domain expertise in cloud security and control compliance, and its thorough understanding of NIST 800-53 control requirements to provide a suite of approved 3PAO services. In addition, Securitybricks has developed an automation approach that can shorten a CSP’s FedRAMP readiness timelines and reduce certification costs by 40%.

“FedRAMP is the first step in Securitybricks’ commitment to the Public Sector market. President Biden signed the FedRAMP Authorization Act in Dec. 2022, which aims to facilitate and accelerate secure cloud adoption by providing defined security authorizations, and which opens the federal market to CSPs of all sizes…” stated Raj Raghavan, CEO of Securitybricks.

###

About Securitybricks, Inc.
Securitybricks, Inc. is a firm focused on cloud security and compliance. Based in the U.S., its team members are all U.S. citizens, including military veterans, with more than 15 years of experience in implementing cybersecurity and regulatory compliance controls. https://securitybricks.io/.

About FedRAMP
The Federal Risk and Authorization Management Program (FedRAMP) is a United States federal government-wide compliance program that provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services. All Cloud Service Providers seeking to offer services to the Federal government are required to be assessed by a 3PAO.

Learn more about how Securitybricks can help you achieve FedRAMP compliance at the FedRAMP Marketplace.

Katalin Pesti
Securitybricks, Inc.
3PAO@securitybricks.io

The Cyber Compliance Market

Recently, someone asked me to quantify the federal cyber market. 

FedRAMP is now a law that underlines the Government’s Cloud first mandate. After years of ambiguity and excessive costs to become FedRAMP certified and demonstrate data protection controls based on each agency’s needs, the law now sets a level playing field for mid-size service enterprises that want to tap into the Federal market. The new law puts in place a system of reciprocity that allows federal agencies to certify vendors more easily while keeping the same level of data protection.

While this law is appealing, the certification rules have not changed. Readiness is still a mountain to climb, even with an understanding of the intent of the NIST 800-53 controls and their applicability to the service provider’s environment. The NIST requirements are complex, and it is no small feat for cloud security architects and DevOps teams to design and implement the service within an approved boundary with appropriate data controls. The demand for these cloud security professionals is very high.

Once you are FedRAMP certified, the burden continues: providing continuous monitoring reports that cover incidents, security events, and vulnerability scans, while ensuring that new product features don’t cause a “significant change,” is an ongoing program.

“Let us do the numbers,” as Kai Ryssdal says on my favorite NPR show, Marketplace.

  • While 2022 saw the federal government spend over $11B on cloud technologies, the new bill signed in Dec. 2022 increases that spending
  • The Federal market is a long-term revenue stream with a market of 440 agencies
  • Government agencies in 10 states have adopted FedRAMP and renamed it StateRAMP
  • FedRAMP is the security gate that will open the gates to these agencies
  • FedRAMP requires validation from a pool of 40 3PAOs
  • The lack of cloud security and application security professionals will further strain service providers’ ability to get certified quickly

The numbers are interesting but, where do you start?

  • Does your compliance team or security team understand the NIST security framework?
  • Is your commercial cloud deployment aligned to security benchmarks or regulations?
  • Don’t let NIST’s 1000 controls intimidate you. These are common sense cyber hygiene controls, broken into domains, that your information security team has probably already implemented
  • 3PAOs can offer guidance, but your FedRAMP readiness team should have cloud security engineers who can map current security tools and processes to NIST requirements
  • While AWS, GCP and Azure offer “FedRAMP Ready” GovCloud, see if it makes sense to implement your cloud software in the GovCloud and continuously monitor it
  • This is not a security tool game or FedRAMP ready “blueprint” but an assessment of your security controls and process to meet a slightly higher security requirement

There is a small battalion of certified assessors who can provide guidance and certification. The shortage of certified auditors is increasing timelines, as many of us are now getting ready for CMMC, a DoD mandate that impacts 300,000+ DoD subcontractors in 2023.

Tablet Command Partners with Credio, Inc. to Strengthen Cybersecurity Profile

San Rafael, CA, October 26, 2021—

Tablet Command is pleased to announce a partnership with cybersecurity advisory firm Credio, Inc. to ensure the continuous data protection of all Tablet Command systems and services.

Reports of several recent cyberattacks such as the Colonial Pipeline shutdown reveal that it’s more important than ever to ensure systems that impact public welfare are safe from hackers trying to disrupt infrastructure. Tablet Command’s software is saving lives in the middle of a global pandemic, and this partnership with Credio, Inc. safeguards their ability to do so securely.

“Tablet Command elected to use Credio, an independent cybersecurity partner, in order to ensure objective assessments of our systems and services,” said William Pigeon, Tablet Command CTO. “We are committed to avoiding any internal bias based on the fact that we created these systems. Credio was selected after an exhaustive search, and the service they have provided has exceeded our expectations in every way.”

“The increase in cloud adoption and remote work since the Covid-19 pandemic has resulted in a dramatic increase in cyber-attacks. Credio, Inc. helps enterprises of all sizes implement relevant security and privacy controls to protect their digital assets. We are humbled by the opportunity provided to us by Tablet Command, and Credio is honored to be able to make an impact in public welfare services,” added Raj Raghavan, CEO of Credio, Inc.

About Tablet Command

Tablet Command provides a best-in-class emergency incident response and management solution to approximately 200 public safety agencies across the United States and Canada. The software delivers increased margins of safety for emergency responders on the ground by providing a complete picture of the scene and tracking more precise information. Tablet Command also creates operational performance data as a byproduct of the incident management process. This data and the operational improvements that can stem from it have never before existed in the public safety sector. For more information, please visit www.tabletcommand.com.

About Credio, Inc.

Credio, Inc. helps its clients successfully gain cloud adoption by balancing security and compliance with digital experience. An ISO17020 accredited security advisory firm focused on Cloud security posture management (CSPM) and compliance, Credio helps clients build secure cloud environments with a team of industry experts that includes military veterans. To learn more visit www.crediopartners.com.

Alicia Perez
Credio, Inc
+1 888-682-9616
alicia.perez@crediopartners.com

Press Release: Credio Launches Managed Application Security Service

SAN FRANCISCO, CALIFORNIA, UNITED STATES, September 20, 2021/EINPresswire.com/ —

Credio, Inc. is pleased to announce the launch of its Managed Application Security service. Driven by customer demand, Credio’s subscription platform enables customers to perform a source code security scan directly from their code repositories. Offered as a white glove service, Credio’s platform provides a combination of automated tools backed by security experts to help remediate vulnerabilities within the SSDLC process.

Credio’s Managed Application Security service is targeted at the millions of developers who use open-source code bases, libraries, containers, and Kubernetes applications. A recent survey noted that over 84% of commercial applications have some sort of open-source component.

This managed service is an extension to Credio’s secure code training and integration of security programs within DevOps teams.

“In response to recent software supply chain attacks, increased ransomware attacks and new contracts from our customers to help detect security vulnerabilities within developer IDE, we are excited to join the recent initiative led by Microsoft, Google, and AWS in building secure software supply chains. Our managed service is another commitment to address the growing shortage of cybersecurity expertise.” – Raj Raghavan, CEO of Credio, Inc.

The platform provides:

  • OnDemand source code scanning directly from code repositories
  • API Integration to CI/CD pipeline and JIRA for ticketing
  • Knowledge base of open-source vulnerabilities
  • Advisory services for threat analysis and remediation assistance
  • Secure code training on fundamentals of code security

About Credio, Inc.

Credio, Inc. helps its clients successfully gain cloud adoption by balancing security and compliance with digital experience. An ISO17020 accredited security advisory firm focused on Cloud security posture management (CSPM) and compliance, Credio, Inc. is helping clients build secure cloud environments with a team of industry experts that includes military veterans. To learn more visit us at www.crediopartners.com.

Alicia Perez
Credio, Inc
+1 888-682-9616
alicia.perez@crediopartners.com

Data Privacy Day: What Will Privacy Look Like Under a Biden Presidency?


January 28 is Data Privacy Day, when we all get to spend the day thinking critically about the importance of protecting our personal data online. Did you know that the reason Data Privacy Day falls on the 28th is because the Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data was opened for signature by the Council of Europe on this day in 1981? On January 20, 2021, Joe Biden was sworn in as America’s 46th President, so we thought it would be fitting to take a deep dive into how the new Biden Presidency might approach privacy over the next few years.

Will Privacy be a Priority?

In truth, Biden has been rather light on details when it comes to specifics around data privacy. There are a few signals, however, that Biden may be a positive influence for advancing stronger privacy and data security protections.

On the record: Biden stated in January 2020 that the U.S. should be “setting standards not unlike the Europeans are doing relative to privacy.” In the same interview, he also suggested that any Supreme Court nominees should have a strong recognition of the right of privacy.

Foreign Policy: Biden’s Foreign Policy Plan laid out a vision for advancing the “security, prosperity, and values of the United States” by renewing alliances, strengthening our own democratic principles at home, and ensuring a level playing field in trade. This includes bolstering protections for data privacy, and ensuring adequate protections against cyber theft.

Domestic Policy: Biden’s plans specifically call out the importance of considering diverse stakeholders when it comes to data protection. For example, Biden promises to take account of the “needs of the disability community when strengthening and enforcing data privacy protections,” and to ensure that adequate privacy protections are enforced when collecting data on LGBTQ+ people. A Biden-Sanders Unity Task Force issued recommendations in August which also cited the need to develop best practices around preventing student data sharing by for-profit organizations, curbing civil rights and personal privacy abuses around police use of body cameras, and setting guidelines regarding the use of biometric surveillance and information sharing at the border. It’s noteworthy that with regard to the reforms around immigration, the Biden-Sanders recommendations outline five of the seven GDPR principles: transparency, accuracy, accountability, fit for purpose, and timely.

While Biden’s technocratic approaches often favor more data collection, it’s helpful to note that in most cases, sentences on data collection are followed by the importance of disaggregation of data, transparency and accuracy, to ensure privacy is maintained.

Big Tech: Biden has emphasized the importance of reining in Big Tech, by signaling that he plans to pursue antitrust actions and potentially repeal or reform Sec. 230 of the Communications Decency Act, which gives broad immunity to online platforms for content posted by users. He has called out privacy concerns and excessive data collection by firms such as Facebook, Google and others as one of the reasons that Big Tech needs another look.

Biden Appointments: Biden is also surrounding himself with experts in privacy, tech and AI from the Obama administration, including:

  • Christopher Hoff (U.S. Department of Commerce) – Hoff will serve as deputy assistant secretary for services at the U.S. Department of Commerce, overseeing the U.S. Privacy Shield negotiations with the EU. He has an extensive privacy background, and has had a long career in the public and private sector in privacy matters. [IAPP Profile]
  • Robert Silvers (U.S. Department of Homeland Security Cybersecurity and Infrastructure Security Agency) – Silvers is expected to be appointed to lead the U.S. Department of Homeland Security’s Cybersecurity and Infrastructure Security Agency, a position formerly held by Christopher Krebs. Silvers is currently a partner at Paul Hastings, and is the vice chair of the firm’s privacy and cybersecurity practice. [Paul Hastings bio]
  • Alondra Nelson (OSTP Deputy Director) – Nelson is a professor at the Institute for Advanced Study, who studies societal impacts of emerging technologies, including AI and algorithmic impacts on bias, data privacy and corporate influence on research. [Wikipedia]

The VP, Kamala Harris, also has hands-on experience pushing privacy and consumer protections, both during her tenure as California’s AG, and in Congress.

Will There be a National Privacy Law?

The US’ byzantine patchwork of sectoral federal and state laws has made privacy compliance tough for business. Currently, all states have mostly sectoral laws (e.g., medical privacy, social security protections) on the books, but more states are looking to follow the lead of states like California and Maine in crafting broader legislation. All 50 states, plus the District of Columbia, Guam, and Puerto Rico, also have data breach notice laws in place. It’s no secret that big tech firms are heavy political donors, and that compliance with dozens of disparate laws is far more costly than compliance with wholesale approaches like the GDPR. Privacy is also one of the few issue areas where bipartisan support is possible (albeit via very different means). That raises the question: will Congress push for a new federal Privacy Act? While many have speculated for years that a national privacy law is ‘on the horizon’, at best, we can offer only cautious optimism.

Cross-Border Data Protection

In July 2020, the Court of Justice of the European Union (CJEU) invalidated the U.S. Privacy Shield, a mechanism used by many US firms to transfer data between the EU and US. In the case of Irish Data Protection Commissioner v. Facebook, Schrems, et al. (Schrems II), the CJEU found that the US’ broad surveillance powers, lack of notice to affected EU data subjects, and virtually no right of redress, meant that the US law did not meet the level of data protections necessary to meet adequacy requirements under the GDPR. This ruling has the potential to nullify countless cross-border transfers for organizations large and small. Despite the Court’s broad declaration of invalidity, that hasn’t stopped the EU and US from trying to work things out. Currently, this task is undertaken by the Deputy Undersecretary for the Department of Commerce, and the recent appointment of Hoff signals that such talks may be top of mind for the administration. That said, it’s highly unlikely that the US will reform the broad surveillance powers currently granted to the three-letter agencies, so meeting the spirit of the GDPR’s broad data protection obligations seems a distant prospect.

Just When You Thought it was Bad Enough: The SolarWinds Attack

This year has been … a wild ride to say the least. 2020 has packed more in its yearly trip around the sun than some decades. First, there were the fires in Australia, Brazil, and California. Then came March, and the collective realization that things were never going to be what they were, even after the pandemic. Oh, and there was a presidential election that left everyone on edge, ongoing racial, economic, and political turmoil, and even a Brexit deal (of sorts). In short, we’ve all seen some things.

But this crazy year still had a bit more crazy to give us, and so on December 13, FireEye disclosed one of the largest, most sophisticated global intrusion & espionage campaigns in modern history, the SolarWinds supply chain attack. The compromise, which has been initially attributed to APT 29 (Cozy Bear), Russia’s foreign intelligence service, has affected at least 200 organizations directly (and potentially affected thousands more) around the world. Details are still being uncovered by the day.

A Quick Overview of the Attack

On December 13, FireEye disclosed that it had been the victim of a supply chain attack via the SolarWinds Orion platform, used to monitor and manage IT health. Attackers used digitally-signed certificates issued from the SolarWinds website to install an infected update package masquerading as a legitimate Orion software update. Once the payload was installed, communication with third-party servers was established, allowing for remote access by the attackers. Then the payload removed itself and restored legitimate update files. With remote access, the attackers were able to gain additional credentials and move laterally throughout the network against specific targets. Current timelines project that the attack has been ongoing since at least March 2020, with the initial exploit going back to October/November 2019.

SolarWinds Malware Infection Chain — Microsoft Defender Research Team

The initial disclosure noted that one of the payloads, SUNBURST, had been used to conduct espionage against victim sites, and leveraged multiple sophisticated techniques to evade detection, obscure activity, and maintain persistence. One of the more clever aspects was the use of local IPs and dynamically-generated hostnames that match the victim’s environment, making the attack even more difficult to detect. There’s also potentially a second attack vector, known as SUPERNOVA, which is still being investigated and may be piggybacking on the SUNBURST vulnerability.

The attack is complex, many-pronged, highly technical, and worth a deeper dive. We’ve compiled a list of great resources to read over to better understand how the attack works (and what mitigations can be taken).

Why Supply Chain Attacks Are Spreading

We’ve talked before about the risk of supply chain attacks. Senior Consultant Carey Lening has given a talk about the growth of supply chain attacks across numerous industries, including finance, the maritime sector and other industries.

What makes these attacks so challenging is that organizations have limited control over the security posture of downstream providers. Even a Zero Trust security model is unlikely to have stopped the SolarWinds attack, as the tool itself had privileged access to enterprise servers. And despite what opportunistic vendors may be claiming, no single tool or service can prevent this.

Unfortunately, the best solutions to mitigate against future SolarWinds-style attacks tend to require buy-in from the top, both in terms of cost and resources, but also a willingness to fundamentally change how security is practiced internally. In short, a defense-in-depth, mature security model that emphasizes:

  • Thorough network and device hardening, as well as adherence to baseline best practices for security;
  • Comprehensive visibility of system and network activities;
  • Regularly sharing and updating threat data across industries, domains and tools;
  • Timely review and actioning of relevant threat indicators, including temporal analysis of compromised devices to understand lateral movements;
  • Isolation and prompt investigation of machines where known-bad file signatures have been detected;
  • Identification of compromised (or likely compromised) accounts.
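
As one small illustration of the signature-based item in the list above, the following is a minimal sketch of checking files against a set of known-bad SHA-256 hashes. It assumes the hashes come from a threat feed you already trust; the function names and the demo data are hypothetical, and no real indicator of compromise is included.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large binaries are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_known_bad(root: Path, bad_hashes: Iterable[str]) -> List[Path]:
    """Return every file under root whose SHA-256 appears in bad_hashes."""
    bad = {h.lower() for h in bad_hashes}
    return sorted(
        p for p in root.rglob("*") if p.is_file() and sha256_of(p) in bad
    )


if __name__ == "__main__":
    # Demo with throwaway files; the "bad" hash is derived from sample bytes,
    # not taken from any real malware report.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "clean.dll").write_bytes(b"legitimate update")
        (root / "tampered.dll").write_bytes(b"sample payload")
        feed = [hashlib.sha256(b"sample payload").hexdigest()]
        for hit in find_known_bad(root, feed):
            print("isolate and investigate:", hit.name)
```

A hash match only catches files that are already known to be bad, which is why the list pairs it with visibility, threat sharing, and account review rather than treating it as sufficient on its own.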

Additionally, standards bodies, government regulators, and big industry players (looking at you, Microsoft, Amazon, Google, Apple, etc.) also need to step up and begin to enforce industry-wide changes. As the Atlantic Council notes in their detailed report on supply chain attacks, ‘Breaking trust: Shades of crisis across an insecure software supply chain’, support for robust, widely-compatible secure standards and code practice is paramount. Improving open source libraries is another critical component that will take a village.

Finally, there should also be an emphasis on holding vendors and third party providers to account for their own security practices (or lack thereof). While there’s no such thing as perfect security, in the case of SolarWinds, security was … not exactly a priority. Rewarding firms with dollars despite lackluster security practice sends a message that security isn’t a critical concern, and increases the attack surface.

In short, we’re all in this together, and we need to start acting accordingly.

My Brand! The Rise of the Elevated Spoof.

At Credio, we’ve written before about how COVID-19 is having an outsized impact on everything in daily life, including cybersecurity and privacy threats.

As users have been forced to leave the confines of hardened on-prem networks and turn to cloud and other hosted services, organizations have faced greater challenges, often with fewer resources at hand.

The Rise of the Elevated Spoof

For this week’s issue, we wanted to delve into one growing area of concern for organizations: the rise of domain spoofing and improved phishing techniques.

According to a recent report by F5 Labs, “55% of phishing sites made use of target brand names and identities in their URLs.”

Gone are the days of the poorly-written, dodgy sites that boasted exaggerated urgency and laughable spelling mistakes. Now criminals are going for sites that genuinely look and act like their targets — right down to the domain names.

More TLDs = More Opportunities to Wreak Havoc

So how are fraudsters doing it? It turns out, there are a number of techniques available.

Domain Name Spoofing: Fraudsters are learning that, thanks to the wonders of hundreds of top level domain names, it’s still easy to register a deceptively-similar looking domain name, clone a target’s login page, and blast out a link to the user.

For example, let’s say you’re interested in mimicking the domain name for apple.com. Obviously, Apple has already registered all the apple.* domains that most of us can think of. But as more top-level domains and ccTLDs (country-code TLDs) come online, it becomes a game of whack-a-mole to keep up.

Compounding this is the rise of free and low-cost domain registrars such as Freenom and DotTK, which provide inexpensive (and sometimes free) domain registrations, even of popular domains.

Unicode and IDNs: Add to that the problems that come from our multilingual world and the Unicode standard. Unicode is a great equalizing force, opening up opportunities for non-Western speakers to be heard, but providing text for most of the world’s writing systems has a cost: it gives criminals a bigger sandbox to play in.

IDNs, or Internationalized Domain Names, use the power of the Unicode standard to allow organizations to connect online in local languages. IDN registrars are still few and far between, but some will allow fairly convincing-looking registrations, for example applе (where the final letter is the Cyrillic small letter ie rather than the Latin e).

Punycode: But say you’re a scammer, and committed to keeping the domain in the .com TLD. Now, GoDaddy won’t accept non-ASCII Unicode character sets, so your plans for applе.com likely won’t fly. Enter punycode.

Punycode is a way of representing Unicode characters that cannot be written in ASCII using only ASCII characters. Using punycode, you can include non-ASCII characters within a domain name by generating a “bootstring” encoding of the Unicode label, prefixed with xn--. Here’s the punycode for applе.com: xn--appl-y4d.com (which can be registered for around $11).
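
To make the conversion concrete, here is a minimal sketch using Python’s built-in `idna` and `punycode` codecs. The choice of Python and the helper names are ours, not something the article prescribes. The Cyrillic letter is written as the escape `\u0435` so it cannot be mistaken for a Latin e.

```python
# Illustrative only: maps a Unicode domain to its ASCII "xn--" form and back.
CYRILLIC_IE = "\u0435"  # CYRILLIC SMALL LETTER IE, visually close to Latin "e"


def to_ascii_domain(domain: str) -> str:
    """Encode each label of a domain with Python's built-in IDNA codec."""
    return domain.encode("idna").decode("ascii")


def to_unicode_label(ascii_label: str) -> str:
    """Decode one 'xn--' label back to Unicode; other labels pass through."""
    if ascii_label.startswith("xn--"):
        return ascii_label[4:].encode("ascii").decode("punycode")
    return ascii_label


spoofed = "appl" + CYRILLIC_IE + ".com"
encoded = to_ascii_domain(spoofed)
print(encoded)  # xn--appl-y4d.com

# Round trip: the decoded label ends in the Cyrillic letter, not Latin "e".
decoded = to_unicode_label(encoded.split(".")[0])
print(decoded == "apple", decoded == "appl" + CYRILLIC_IE)  # False True
```

The ASCII form is what the registrar and DNS actually see; whether a browser shows the `xn--` form or the Unicode form is a display decision made on the client.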

On certain vulnerable browsers (and especially mobile devices, where eyeballs are contending with smaller screens, shrunken URLs, and the inability to hover a mouse), it renders the page as applе.com, which looks surprisingly legitimate, especially if you’re clicking on a link and might be distracted, or haven’t had your first cup of coffee. Throw in a free Let’s Encrypt TLS certificate, and it’s a very convincing-looking fraud opportunity.
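
Because the eye cannot reliably tell these characters apart, a tool can. The sketch below flags domain labels that mix letters from more than one script, using the first word of each character’s Unicode name as a rough script tag. This is a simplified heuristic of our own for illustration, not a description of how any particular browser decides what to display, and a label built entirely from lookalike characters of a single script would not be caught by it.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Return coarse script tags (e.g. 'LATIN', 'CYRILLIC') for letters in a label."""
    if label.startswith("xn--"):
        # Decode the punycode form so we inspect the characters a user would see.
        label = label[4:].encode("ascii").decode("punycode")
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def looks_mixed_script(domain: str) -> bool:
    """True if any single label mixes letters from more than one script."""
    return any(
        len(scripts_in_label(label)) > 1
        for label in domain.lower().split(".")
    )


for candidate in ("apple.com", "xn--appl-y4d.com"):
    print(candidate, looks_mixed_script(candidate))
```

A check like this could sit in a mail gateway or link scanner, where a flagged domain is shown in its `xn--` form or held for review.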

Site Cloning: Even the practice of cloning a website can often be trivially easy. One need only visit the target’s website, save the page, and extract the HTML, CSS, images and other elements, and upload that content to a hosting site. With a few alterations, a fraudster can make a rather convincing-looking site at https://applе.com that might confuse even the most skeptical among us.

While phishing attacks will only continue to improve so long as there are victims to be had, it’s important that awareness, security controls, and the tools we use, also continue to evolve with the threat. Our eyes can’t do it alone.

Resilience: Planning for When the Clouds Go Away

On November 25, AWS suffered a major service disruption in its Northern Virginia (US-East-1) region. It left the region out of commission for over 17 hours, and took thousands of sites and services, including Adobe, Twilio, Autodesk, the New York City Metropolitan Transit Authority, and The Washington Post, offline.

When Planned Upgrades Go Wrong

In a post-mortem, Amazon described how a planned capacity increase to their front-end fleet of Kinesis servers led to the servers exceeding the maximum number of threads allowed by the current configuration. As the servers reached their thread limits, fleet information, including membership and shard-map ownership, became corrupted. The front-end servers in turn were generating useless information, which they were propagating to neighboring Kinesis servers, causing a cascade of failures.

Additionally, restarting the fleet appears to be a slow and rather painful process, in part because AWS relies on this neighboring servers model to propagate bootstrap information (rather than using an authoritative metadata store). Among other things, this meant that servers had to be brought up in small groups over a period of hours. Kinesis was only fully operational at 10:23pm, some 17 hours after Amazon received the initial alerts.

Finally, the failures with Kinesis also took out a number of periphery services, including CloudWatch, Cognito, and the Service Health and Personal Health Dashboards, used to communicate outages to clients. For reasons that aren’t totally clear, these dashboards are dependent on Cognito, and may not be sharded across regions. Essentially, the post-mortem seems to imply that if Cognito goes down for a region, affected customers in that region will have no way of knowing.

Resiliency, or How to Survive the Next Outage

While Amazon posted a number of lessons learned in their post-mortem, which are all worth reading, today we wanted to discuss what customers can do to limit their own risks in the Cloud:

  • Build Outages into your BCP and IRP: While providing continuous service and support is the main goal, we should all be mindful of worst-case-scenarios. That means identifying critical workloads and applications, considering and having a plan to execute fail-over options, and ensuring that customers can be alerted when a failure can’t be avoided. Build response and recovery plans around these considerations.
  • House Critical Workloads in Multiple Availability Zones (AZs) and Regions: Since the AWS outage appears to have been isolated to a single region, organizations that hosted their critical systems across regions were less impacted when US-East-1 went down. Consider services like AWS’ Multi-Region Application Architecture to fail over to a backup region. These features are not available by default, however, and must be built into an organization’s overall architecture plan.
  • Use Amazon Route 53 Region Routing: Another best practice for ensuring geographic distribution is to use Route 53 to route users to infrastructure outside of AWS, whether it’s another cloud provider, or a more minimalist on-prem backup service.
  • Test for Your Worst Case: Adopt Netflix’s ‘Chaos Engineering’ model to test what happens when networks, applications, or infrastructures go down, and develop a road-tested plan for how to work around those failures.
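
The fail-over and worst-case-testing ideas in the list above can be sketched in a few lines. This is a client-side illustration only, under our own assumptions: the endpoint URLs are hypothetical placeholders, `fetch` stands in for whatever call your application makes, and the simulated outage is a stand-in for a real chaos experiment. It is not a substitute for DNS-level routing such as Route 53.

```python
from typing import Callable, List, Sequence


class AllEndpointsFailed(Exception):
    """Raised when every endpoint in the fail-over list has failed."""


def call_with_failover(endpoints: Sequence[str], fetch: Callable[[str], str]) -> str:
    """Try each endpoint in priority order and return the first successful result."""
    errors: List[str] = []
    for endpoint in endpoints:
        try:
            return fetch(endpoint)
        except Exception as exc:  # broad on purpose: any failure triggers fail-over
            errors.append(f"{endpoint}: {exc}")
    raise AllEndpointsFailed("; ".join(errors))


def fake_fetch(endpoint: str) -> str:
    """Test double that simulates the primary region being down."""
    if "us-east-1" in endpoint:
        raise ConnectionError("simulated regional outage")
    return f"ok from {endpoint}"


# Hypothetical endpoints: primary region first, backup region second.
ENDPOINTS = [
    "https://api.us-east-1.example.com",
    "https://api.us-west-2.example.com",
]

print(call_with_failover(ENDPOINTS, fake_fetch))
# ok from https://api.us-west-2.example.com
```

Running the same function against a test double where every endpoint fails is a cheap way to confirm that the “alert the customer” path in your response plan is actually reachable.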