Source IP Based Conditional Access for SaaS – Wait, You Put Donuts On Your Ferrari?

No one adopts SaaS applications with the intent of making them a slower, less productive user experience than their previous on-premises instantiations. Yet, in the age of cloud-based SaaS applications and mobility (aka Work From Anywhere), performance-impacting source IP based conditional access controls are somehow still popping up in otherwise modern enterprise network and security architectures.

A common example I have often seen is an enterprise requiring that end user access to Office 365 be limited to users coming from a known corporate source IP address. This has several unfortunate side effects, the largest being the unnecessary, performance-impacting latency introduced by forcing both on- and off-network end user traffic back through the corporate data center, further straining any centralized outbound security hardware appliances in the process. Another side effect is a poor end user experience: users are made to jump through hoops such as turning on a remote access VPN just to reach a public cloud hosted SaaS application.

When we have open discussions with enterprises and start to break down why this is still a thing, we often land on a common set of requirements that center around risk reduction:

1.) User identity alone isn’t good enough; we need to know that the user is coming from one of our managed devices with our AV/EDR installed

2.) We want to ensure that the user is passing through our outbound security controls and that we have visibility into this key SaaS application traffic, so we force them to come back on-premises

3.) We want to prevent intentional or accidental data loss

While these are absolutely important risk factors that we need to account for in our SaaS adoption strategy, we really need to rethink the mechanisms by which we employ our controls so as not to defeat the original goals of migrating to SaaS applications that no longer live on our network in the first place.

Let’s look at some of the more modern alternative methods of implementing SaaS access risk reduction while not adversely impacting performance and end user experience.

Access from Managed vs. Unmanaged Devices

Ideally this can be controlled by leveraging a combination of your SAML identity provider’s conditional access criteria and an inline cloud security solution to assess the posture of the device from which the end user is requesting SaaS access. Alternatively, we may want to grant ‘reduced’ access when the end user is not coming from a managed device, such that they can still reach the SaaS application yet don’t get risky, full, unfettered access.
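
As a rough illustration of that decision flow, here is a minimal sketch in Python. All of the attribute names are hypothetical; real identity providers and inline security platforms expose their own conditional access criteria.

    def decide_saas_access(user_authenticated: bool, device: dict) -> str:
        # Hypothetical posture attributes reported by a device agent.
        if not user_authenticated:
            return "deny"
        managed = device.get("managed", False)
        edr_running = device.get("edr_running", False)
        disk_encrypted = device.get("disk_encrypted", False)
        if managed and edr_running and disk_encrypted:
            return "full_access"
        # Unmanaged or unhealthy device: grant reduced (e.g. browser-only,
        # read-only) access rather than a hard block.
        return "reduced_access"

    print(decide_saas_access(True, {"managed": False}))  # -> reduced_access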

Inline Security and Visibility

For WFA scenarios this doesn’t have to mean bringing the user back onto the corporate network in order to access SaaS securely. The reality is that we should be assessing a combination of the current device posture and a real-time end user risk profile based on recently observed, potentially risky behavior. Another potential attribute is where the user is coming from, not based solely on their source IP address, but through the lens of detecting ‘impossible travel’, where the access request originates from a completely different geography than the user’s most recent internet/SaaS-bound traffic.
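
To make the ‘impossible travel’ idea concrete, here is a minimal sketch assuming we already have geo-resolved access events; it flags a request when the implied travel speed between two observations is physically implausible:

    import math

    def haversine_km(lat1, lon1, lat2, lon2):
        # Great-circle distance between two points on Earth, in km.
        r = 6371.0
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def impossible_travel(prev, curr, max_kmh=900):
        # prev/curr: (lat, lon, epoch_seconds) from geo-resolved access logs.
        dist = haversine_km(prev[0], prev[1], curr[0], curr[1])
        hours = max((curr[2] - prev[2]) / 3600.0, 1e-6)
        return (dist / hours) > max_kmh  # faster than a commercial jet

    # London at 09:00, then Sydney 30 minutes later -> flagged.
    print(impossible_travel((51.5, -0.1, 0), (-33.9, 151.2, 1800)))  # True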

Preventing Data Loss

There are several potential avenues when it comes to preventing data loss from a SaaS application. An inline DLP solution (preferably cloud based), combined with an API-based cloud access security broker (CASB) and endpoint DLP, would all help to curtail the risk of data loss without forcing SaaS access back through a centralized set of on-network security controls.
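
As a deliberately simplified illustration of what an inline DLP engine does, here is a sketch that scans outbound content for a couple of common sensitive data patterns. Real products use far richer detection (validated detectors, exact data matching, document fingerprinting), so treat the patterns below as placeholders:

    import re

    # Hypothetical, simplified patterns for illustration only.
    PATTERNS = {
        "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    }

    def dlp_verdict(payload: str) -> str:
        hits = [name for name, rx in PATTERNS.items() if rx.search(payload)]
        return "block" if hits else "allow"

    print(dlp_verdict("my ssn is 123-45-6789"))  # -> block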

These of course work fine for our managed endpoints, but what about unmanaged devices? A combination of identity proxy and cloud browser isolation can be a powerful tandem. With an identity proxy in place, an end user coming from an unmanaged device can be redirected by your SAML IdP to an inline cloud security solution, which can in turn redirect this unmanaged SaaS access request into an isolated cloud browser. The isolated browser still allows the user to access the SaaS application, but their potentially ‘dirty’ unmanaged device never interacts with it directly. From a data protection perspective, read-only access controls can be enforced to prevent file uploads/downloads and the inputting of any data into the SaaS application.
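
A rough sketch of the routing decision such an identity proxy might make; the isolation URL and control flags below are purely illustrative:

    from urllib.parse import quote

    def route_saas_request(device_managed: bool, app_url: str) -> dict:
        # Managed devices go straight to the app; unmanaged devices are
        # steered into an isolated cloud browser with read-only controls.
        if device_managed:
            return {"redirect": app_url, "isolated": False}
        return {
            "redirect": "https://isolation.example.com/launch?target=" + quote(app_url, safe=""),
            "isolated": True,
            "allow_upload": False,
            "allow_download": False,
            "allow_paste": False,
        }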

Stepping Back On the Gas for SaaS

In summary, there are far more thorough and effective methods of injecting risk-reducing controls for SaaS access into your enterprise network and security architecture.

So let’s put the Michelin Pilot Sport tires back on our new Ferrari and migrate away from those legacy fixed-location-based controls. This will empower our users to be more productive wherever and whenever, without sacrificing the security and visibility of our key SaaS applications.

Disclaimer: The views expressed here are my own and do not necessarily reflect the views of my employer Zscaler, Inc.

How hard is the Zero Trust journey?

Just how hard is it to get started with Zero Trust? If it’s truly a journey, then when should an enterprise expect to start seeing the benefits?

What prompted me to write this post was a recent review of NIST Special Publication 800-207, “Zero Trust Architecture”, published in August 2020. It first nicely lays out the fundamental principles of Zero Trust, which I will quickly summarize here:

  • Zero Trust is a model which aims to reduce the exposure of resources to attackers and minimize or prevent lateral movement within an enterprise should a host asset be compromised.
  • Trust is never granted implicitly and should be continuously verified via authentication and authorization of identity and security posture upon each access request.
  • There is no complete end to end Zero Trust product offering. It is instead an ecosystem of component systems that when properly integrated allow for creation of a Zero Trust Architecture.
  • Implementing a Zero Trust Architecture is a journey which is all about continuing to reduce risk.

110%, could not agree more with their explanation of the tenets of a Zero Trust model.

There is also a really good explanation of all the vital ecosystem components required to interact with each other in order to facilitate translation of Zero Trust principles into an implementation of a Zero Trust Architecture.

However, Section 7, “Migrating to a Zero Trust Architecture”, was a little discouraging for the reader. It makes moving towards a Zero Trust Architecture in order to start reducing risk seem like an arduous and daunting task. After some poking around and seeing comments on various public forums, I’m apparently not the only one who came away with that takeaway. Is it really this hard to get started?

Section 7 assumes that, in order to make progress on the zero trust journey, an enterprise must first understand where all of its existing resources are and who needs access to them, and that if this isn’t done up front, any attempt at an initial implementation will prevent access to key resources…in other words, you will break things and prevent users from getting their work done.

Fortunately, since NIST 800-207 was published in August 2020, there have been significant gains in the maturity of the Zero Trust ecosystem, ranging from enhanced functionality of identity providers, to integrations with EDR vendors for device context/posturing, to advances in automation of access policy. Thanks in large part to the COVID-19 pandemic, a lot of operational insight has also been gained into how to transition an enterprise towards Zero Trust.

Most importantly for getting started with Zero Trust, there are commercially available traditional VPN alternatives within the Zero Trust Architecture ecosystem for which step 1 of the implementation is to actually facilitate this discovery of application to end user (and user device) access patterns. This can be done without the concern of inadvertently removing any previously granted ability for a user to access a key required application resource, while providing additional risk reduction benefits that are worth mentioning. I will quickly summarize some of these below.

Potential benefits of the initial phase of a Zero Trust Architecture rollout

  • This one bears repeating and expanding on slightly – Immediate granular visibility into all of the applications users are requesting access to, at what time, from which device, from where, and for how long, all of which can then be fed into your SIEM. Discover exactly which private resource assets exist and where they actually physically reside. Yes, you will inevitably discover Shadow IT and realize that you have way more applications than you originally thought 😉
  • Kick the remote users off the internal private network – Once all users are off the network, there is no longer any network-centric implicit trust. Whether an individual application access request is approved is now determined by a continuously assessed combination of user identity and contextual attributes. Application access, not network access, also reduces the risk of lateral propagation of malware.
  • Removal of a public facing inbound VPN listener which can be DDoS’d or compromised – This is a huge risk reduction given all of the CVEs reported in 2020/2021 for RCE vulnerabilities.

What’s Next?

So where does one go next after Phase 1? Phase 1+ is about assessing the discovered user-to-application workflows and then selectively removing more and more risk by locking down access to key applications, via policy, to only the required groups and individuals. Think of these as the ‘Crown Jewels’: applications and internal infrastructure components where compromise and potential data exfiltration would be the most costly.

Implement device posture profiles, which further provide device context, and take advantage of any available endpoint integrations that provide additional risk assessment scoring for the device that can be used in access policy. An enterprise should also immediately look to restrict third-party access to only the resource(s) that are required. This is really all about continuing to move towards identity and context based least privilege access around the things that are most vulnerable, in order to continue to reduce risk.
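
To illustrate the idea (this is not any particular vendor’s policy language), here is a minimal sketch of an access rule that combines user identity, device posture, and an EDR-style risk score. All attribute names, groups, and thresholds are hypothetical:

    from dataclasses import dataclass

    @dataclass
    class AccessRequest:
        user_groups: set        # group memberships from the identity provider
        posture_ok: bool        # device posture profile checks passed
        device_risk_score: int  # e.g. 0-100 from an EDR integration
        app: str

    # Crown-jewel apps are scoped to specific groups with a tighter risk bar.
    CROWN_JEWELS = {"payroll": {"finance"}, "source-control": {"engineering"}}

    def allow(req: AccessRequest) -> bool:
        if not req.posture_ok:
            return False
        if req.app in CROWN_JEWELS:
            return bool(req.user_groups & CROWN_JEWELS[req.app]) and \
                   req.device_risk_score < 20
        return req.device_risk_score < 50

    print(allow(AccessRequest({"finance"}, True, 10, "payroll")))  # True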

The maybe not so obvious benefits of migrating towards a Zero Trust Architecture

  • Improved performance – For applications being served out of an IaaS cloud like Azure or AWS an authorized user on a postured device can now connect more directly to that private resource as opposed to be being backhauled to a centralized location and then connected out over private links to an IaaS Provider whose Data Center is most likely closer to the remote user than the centralized interconnect point. A user can connect directly to private apps in multiple different locations simulataneously
  • Improved user experience – “Always on Identity and Contextual based least privilege access”. There is no longer a concept of having to be on or off-VPN, its just in time connectivity to any authorized user on an appropriately postured device to any private application anywhere without any change to the way the user would go about accessing the application.
  • Zero Trust isn’t just for remote access – Since Zero Trust is focused on not implicitly trusting the user device’s network location the ability to extend zero trust policy for on-prem users who are already resident on the internal corporate network is a huge plus. To do this the vendor technology must support intelligent interception of client application resource access requests and forward those to an on-prem policy enforcement point as opposed to allowing traditional direct network level access to the requested target resource simply because network reachability exists

Excellent summary of how to get started with ZTA adoption – be sure to check out the full Zero Trust Adoption Best Practices video here.

Hopefully the reader finds this helpful. If you are interested in a tailored, phased plan for how to get started on your Zero Trust journey, feel free to reach out to your local Zscaler Solutions Engineer or attend one of our user group events, where you can connect with other enterprise customers who have already embarked on their Zero Trust journey.

For additional insights into operationalizing Zero Trust, check out this timely podcast, “Maturing zero trust via an operational mindset”, featured on our CXO REvolutionaries site.

Disclaimer: The views expressed here are my own and do not necessarily reflect the views of my employer Zscaler, Inc.

Never Waste A Good Crisis

While listening to a recent podcast featuring one of my colleagues, a former enterprise IT executive, she mentioned the phrase “never waste a good crisis”. The last time I had heard that phrase was during the 2008 financial crisis.

“You never want a serious crisis to go to waste. And what I mean by that is an opportunity to do things that you think you could not do before.”

― Rahm Emanuel

The Crisis

Over the last 16 months I’ve listened to a similar story from IT leaders all over the globe, across almost every industry vertical, who sent employees home on a Friday only to have to figure out on Monday how to get them access to the enterprise applications they needed to do their jobs.

The seemingly ‘easy’ route was to lean on traditional remote access VPN. However, that VPN hardware was originally scoped to support ~35-40% of an enterprise’s employees working remotely at any given time. Those VPN concentrators became immediately saturated, and it was nearly impossible to quickly instantiate additional capacity in a traditional appliance based model. In order to alleviate some of the capacity strain and the end user performance impact of being backhauled to the corporate WAN via VPN, enterprises were forced to implement split-tunneling policies, allowing higher-bandwidth, performance-sensitive applications like O365, Zoom, WebEx and Teams to be split off from the VPN and go direct to the internet, creating risk via a security and visibility gap.

The Opportunity

While some forward thinking IT leaders had already shifted employee remote access away from a traditional VPN paradigm to an elastically scalable, cloud delivered zero trust architecture, others had been considering for some time how they might embrace more of a zero trust paradigm and kick their users off the corporate network. Well, here is where the opportunity emerged amid crisis… COVID-19 sent almost everyone home and kicked the users off the corporate network. The conundrum facing IT leaders was no longer how to kick the users off the corporate network, but how to securely provide them just enough access to the limited set of things they need in order to do their jobs.

Tackling Legacy Technical Debt

Some legacy enterprise IT applications were designed around a fundamental principle of having direct network access to user endpoints.

In response to this challenge, many enterprises investigated how to either adapt these legacy application communication flows from a push model (server-to-client) to a pull model (client-to-server), or abandon on-prem hosting entirely in favor of cloud delivered instances of the same function, serving off-network users without having to bring them back onto the internal private network. In my many discussions with global enterprises, I had the benefit of learning what their IT teams had done to adjust to this crisis, and below I list some typical examples of migrations away from traditional on-prem hosted server-to-client communication models towards client-to-server and remote-work-friendly cloud hosted models.

It is very important to note that this is not a recommendation or endorsement on my part of any of the specific products below; it is simply a recap of what I have seen come up in conversations with prospects and existing customers.

Patch Management – Pulling down software updates over traditional VPNs leads to several problems. First, if a large number of off-net users attempt to pull patches, that can put significant strain on your existing VPN concentrators and internal network bandwidth. Second, if the patch management system is only available to remote users over VPN, then they need to know to turn on the VPN for the patch update to even happen, which can lead to lag time where some systems remain unpatched and vulnerable. Instead of forcing users back onto the corporate network over VPN for patch management, some enterprises moved to completely cloud hosted (SaaS) implementations of patch management (cloud management gateways) that allowed users to pull updates directly over the public internet without care or concern around internal private access or whether their VPN was turned on. Others, who had already replaced their traditional VPN solution with a Zero Trust no-network-access offering, simply flipped their patch management to leverage a pull model whereby the client device would ‘check in’ at regular intervals looking for updates. In either event, both approaches prevent the significant patch update lag that would otherwise lead to unpatched systems and increased risk.
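
To illustrate the pull model, here is a minimal sketch of a client-side check-in loop. The update endpoint and response format are hypothetical; the key point is that the device polls outbound over the internet, so no inbound or VPN connectivity is required:

    import json
    import time
    import urllib.request

    UPDATE_URL = "https://patches.example.com/api/v1/check"  # hypothetical

    def check_in(current_version: str) -> dict:
        # Outbound HTTPS poll: works from any network, no VPN needed.
        req = urllib.request.Request(
            UPDATE_URL,
            data=json.dumps({"version": current_version}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)  # e.g. {"update_available": true, "url": "..."}

    while True:
        result = check_in("1.2.3")
        if result.get("update_available"):
            pass  # download and apply the patch here
        time.sleep(4 * 3600)  # check in every 4 hours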

Remote Desktop Support Management – When users need support personnel to access their machine to help troubleshoot an issue, it was not uncommon to leverage things like Remote Desktop from the support engineer’s PC to the end user’s endpoint. This type of connectivity model clearly only works when there is direct network connectivity from IT support to the afflicted end user’s machine. Several implementations exist which allow for a ‘meet in the middle’ type of remote support access, where a cloud delivered solution can securely enable IT support personnel to remotely access an end user device regardless of what network it resides on. While clearly not an exhaustive list, some popular examples I have heard mentioned are Microsoft Quick Assist, BeyondTrust Remote Support (formerly Bomgar), ConnectWise and TeamViewer.

Legacy VOIP – For VOIP softphone usage and for call center VOIP implementations, it was common to see offerings like Avaya and Cisco Call Manager that leveraged traditional direct network access in order to function properly. What was pretty common across all enterprises was that the end user base dependent on these systems represented a small percentage of the total overall end user count. Some enterprises simply maintained a much smaller deployment of traditional remote access VPN technology to address this user base, while adjusting their IT planning and budgets towards future deployments of UCaaS (Unified Communications as a Service) implementations of these types of systems, with the eventual goal of retiring traditional remote access VPN entirely. Others expedited existing plans and completely migrated users toward UCaaS implementations like Teams, WebEx and Zoom. For call center applications, some enterprises are looking to adopt CCaaS (Contact Center as a Service) solutions like Genesys Cloud, Amazon Connect, Five9 or NICE inContact.

Vulnerability Scanning – Having to bring a user’s device onto the corporate network in order to scan it for vulnerabilities runs counter to the goal of a zero trust model. If we do vulnerability scanning because we suspect the end user’s device may be compromised, then why would we risk bringing a potentially compromised endpoint onto the network where that compromise can spread laterally? And if we look at what happened with the SolarWinds supply chain compromise as an example, it’s probably not the best idea to give a tool complete, unfettered access to everything on the entire internal private network. Having a good modern EDR tool like CrowdStrike, CarbonBlack, SentinelOne or Windows Defender, combined with an agent providing asset vulnerability data that can phone home to the cloud from an off-network remote user device, is more in line with the goals of a zero trust model.
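
As a sketch of that outbound-only ‘phone home’ pattern (the endpoint and payload below are hypothetical), the agent pushes asset data out to a cloud service rather than waiting to be scanned from inside the network:

    import json
    import platform
    import urllib.request

    REPORT_URL = "https://telemetry.example.com/api/v1/assets"  # hypothetical

    def report_asset() -> None:
        # A real agent would inventory installed software and match it
        # against a vulnerability feed; this just sends basic facts.
        payload = {"hostname": platform.node(), "os": platform.platform()}
        req = urllib.request.Request(
            REPORT_URL,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)  # outbound-only: no inbound scan required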

In summary, those enterprises that seized the opportunity to accelerate shifting away from legacy network-centric applications like the ones covered above, towards cloud delivered models, didn’t need to bring their end users back onto the corporate network. They were able to migrate away from the traditional remote access VPN model to a more identity and context based least privilege access model, where users only get access to the applications they need to do their jobs, not access to the internal private network. By ‘keeping the users kicked off the network’, not only are these organizations inherently more secure, but they can now start to evaluate further security and network transformation, starting with “if my users can do their jobs off of the internal corporate network, then do we even need to operate a traditional internal private WAN anymore?”.

Curious to hear thoughts and experiences from others on the challenges they faced in addressing the shift to remote work. Feel free to leave a comment!

Disclaimer: The views expressed here are my own and do not necessarily reflect the views of my employer Zscaler, Inc.

Making The Case For SSL Inspecting Corporate Traffic

Almost every stakeholder I speak with these days, from Enterprise Security Architect to CISO, wants to be able to inspect their organization’s encrypted traffic and data flowing between the internet and the corporate devices and end users that they are chartered to safeguard.

When asked what their primary drivers are for wanting to enable SSL/TLS inspection, the top-of-mind concerns are as follows:

  • Lack of visibility – Upwards of 75-80% of our traffic headed to the internet and SaaS is SSL/TLS encrypted
  • We know that bad actors are leveraging SSL/TLS to mimic legitimate sites to carry out phishing attacks as well as hide malware downloads and Command and Control (C&C) activities
  • I need to know where our data resides – We know bad actors are using SSL/TLS encrypted channels to attempt to circumvent Data Loss Prevention (DLP) controls and exfiltrate sensitive data. Our own employees may intentionally or unintentionally post sensitive data externally

With a pretty clear understanding of the risks of not inspecting SSL/TLS encrypted traffic, one would assume that every enterprise has already taken steps to enable this, right? Well…not necessarily. There are two main hurdles to overcome in order to implement this initiative: one technical, the other political.

The technical hurdle is essentially ensuring that your enterprise network and security architecture supports a traffic forwarding flow, for both your on-prem and off-net roaming users, which traverses an active inline SSL/TLS inspection device capable of scaling to the processing load imposed when 75-80% of your internet and SaaS bound traffic is encrypted. In an enterprise network and security architecture where all end user traffic, even that of remote users, flows through one or more egress security gateway stack choke points comprised of traditional hardware appliances, the processing load imposed by SSL/TLS interception dramatically reduces the forwarding and processing capacity of those hardware appliances, as evidenced in recent testing by NSS Labs.

This is critical in that most enterprises would need to augment their existing security appliance processing and throughput capacity by at least 3x to enable comprehensive SSL/TLS inspection. This constitutes a significant re-investment in legacy security appliance technology that doesn’t align with the more modern direct-to-cloud shift in enterprise network and security architecture design.
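
As a back-of-the-envelope illustration of that 3x figure (my own assumed numbers, not NSS Labs’ results), consider how the required appliance capacity balloons when most traffic is encrypted and inspection carries a multiple of the cleartext processing cost:

    def required_capacity(load_gbps: float, encrypted_share: float = 0.8,
                          inspect_cost: float = 4.0) -> float:
        # Cleartext traffic is processed at 1x cost; decrypted/re-encrypted
        # traffic at roughly inspect_cost x (assumed multiplier).
        clear = load_gbps * (1 - encrypted_share)
        encrypted = load_gbps * encrypted_share * inspect_cost
        return clear + encrypted

    # 10 Gbps of user traffic, 80% encrypted, ~4x cost when inspecting
    # -> ~34 Gbps of effective appliance capacity, i.e. >3x augmentation.
    print(required_capacity(10))  # 34.0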

The second concern, and the primary topic of a recent whitepaper issued by Zscaler, is balancing the user privacy concerns of SSL/TLS inspection against the threat risks of not inspecting an enterprise’s corporate device internet traffic.

Some of the key things to consider in the privacy vs risk assessment and the subsequent decision to proceed with an SSL/TLS inspection policy are as follows:

  • An organization cannot effectively protect the end user and the corporate device from advanced threats without SSL/TLS interception in place
  • An organization will also struggle to prevent sensitive data exfiltration without SSL/TLS interception
  • Organizations should take the time to educate their end users that instituting an SSL/TLS inspection policy is a security safeguard and not a ‘big brother’ control
  • Organizations should inform employees as to the extent of what will and will not be inspected. This should be defined as part of an acceptable usage policy for internet use on corporate issued assets and this policy should be incorporated into their terms of employment agreements
  • Organizations should review this policy with in-house legal counsel, external experts, and any associated workers’ councils or unions, as well as pay careful attention to regional data safeguard compliance frameworks like GDPR
  • Organizations should take the necessary steps to ensure appropriate safeguards are put in place for the processing and storing of the logs associated with decrypted transactions, such as obfuscating usernames
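
As a simple illustration of that last safeguard, here is a sketch that pseudonymizes usernames in transaction logs with a keyed hash (HMAC), so analysts can still correlate events per user without seeing the identity. The key and field names are hypothetical:

    import hashlib
    import hmac

    SECRET_KEY = b"rotate-me-and-store-in-a-vault"  # hypothetical key

    def obfuscate_user(username: str) -> str:
        # Keyed hash: stable per user (allows correlation) but not
        # reversible without the key, which is held separately.
        digest = hmac.new(SECRET_KEY, username.encode(), hashlib.sha256)
        return "user-" + digest.hexdigest()[:16]

    log_entry = {"user": obfuscate_user("jsmith"),
                 "url_category": "webmail", "action": "allowed"}
    print(log_entry)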

For a more comprehensive review of how to navigate the security vs privacy concerns and implement a successful SSL/TLS inspection campaign, take a look at the recent whitepaper that Zscaler has authored – https://www.zscaler.com/resources/white-papers/encryption-privacy-data-protection.pdf

Disclaimer: The views expressed here are my own and do not necessarily reflect the views of my employer Zscaler, Inc.

14.4 Terabits in a single rack unit?

Have switching ASICs gotten too fast?

Looking back at the last few years, it certainly appears that ethernet switching ASICs and front panel interface bandwidth are moving at different paces: a faster switching ASIC arrives just ahead of the ethernet interface speed and optic form factor needed to drive the full bandwidth the ASIC actually provides while still fitting into a 1RU top-of-rack ethernet switch or line card profile.

Current 6.4+ Tbps system-on-a-chip (SoC) ASIC based switching solutions have moved past the available front panel interface bandwidth within a single rack unit (RU). The QSFP28 (Quad SFP) form factor occupies the entire front panel real estate of a 1RU switch at 32x100G QSFP28 ports, prompting switching vendors to release 2RU platforms in order to cram in 64x100G ports and fully drive the newest switching ASICs. With higher bandwidth switching ASICs on the near horizon, the industry clearly needs higher ethernet interface speeds and new form factors to address the physical real estate restrictions.

So where do we go from here?

First, let’s look at the three available dimensions at our disposal for scaling up interface bandwidth.

1.)  Increase the symbol rate per lane.

This means we need an advance in the actual optical components and thermal management used to deliver the needed increase in bandwidth in a power efficient manner. Put more simply, in the words of a certain Evil Scientist who wakes up after being frozen for 30 years: “I’m going to need a better laser, okay.”

2.)  Increase the number of parallel lanes that the optical interface supports.

As an example, in the case of the 40Gbps QSFP form factor, this meant running 4 parallel lanes of 10Gbps to achieve 40Gbps of bandwidth.

3.)  Stuff (encode) more bits into each symbol per lane by using a different modulation scheme.

For example, PAM4 encodes 2 bits per symbol, which effectively doubles the bit rate per lane and is the basis for delivering 50Gbps per lane and 200Gbps aggregate across 4 lanes.
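
Putting those three dimensions together, aggregate interface bandwidth is simply symbol rate x bits per symbol x lane count (ignoring encoding and FEC overhead). A quick sketch:

    def interface_gbps(symbol_rate_gbaud: float, bits_per_symbol: int,
                       lanes: int) -> float:
        # Aggregate bandwidth = symbol rate x bits/symbol x parallel lanes.
        return symbol_rate_gbaud * bits_per_symbol * lanes

    print(interface_gbps(25, 1, 4))  # QSFP28: 4 lanes of 25G NRZ -> 100 Gbps
    print(interface_gbps(25, 2, 4))  # PAM4 at the same rate      -> 200 Gbps
    print(interface_gbps(25, 2, 8))  # 8 lanes of 50G PAM4        -> 400 Gbps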

Looking Beyond QSFP28

Next, let’s look at what is potentially coming down the pike for better interface bandwidth (greater than 100Gbps) and front panel port density.

Smaller form factor 100G

One approach is to simply use a more compact form factor, and this is exactly what the micro QSFP (uQSFP) is being designed to do. uQSFP is the same width as an SFP form factor optic, yet uses the same 4 lane design as QSFP28. This translates into a 33% increase in the front panel density of a 1RU switch when compared with the existing QSFP28 form factor. The uQSFP also draws the same 3.5W of power as the larger QSFP28. It is now going to be possible to fit up to 72 ports of uQSFP (72x100G) into a 1RU platform or line card, allowing for the support of switching ASICs operating at 7.2Tbps when the uQSFP runs at 25Gbps per channel (4 lanes of 25Gbps). If broken out into 4x25G ports, a single RU could terminate up to 288 x 25G ports. uQSFP is also expected to support PAM4, enabling 50Gbps per channel for an effective bandwidth of 200Gbps in a single port, paving the way for enough front panel bandwidth to drive 14+Tbps of switching ASIC capacity in a 1RU switching device form factor. There may, however, be technical challenges in engineering a product with 3 rows of optics on the front panel.

Double-Density Form Factors

Another approach is the QSFP-DD (double density) form factor.

QSFP28-DD is the same height and width as QSFP28, but slightly longer, allowing for a second row of electrical contacts. This second row provides 8 signal lanes operating at 25Gbps, for a total of 200Gbps in the same amount of space as the previous QSFP28 operating at 100Gbps. This provides enough interface bandwidth and front panel density for 36 x 200Gbps and a 7.2Tbps switching ASIC. Break-out solutions are coming that will allow for breaking out into 2x100Gbps QSFP28 connections with QSFP-DD optics on the 100G end. What is not yet clear is whether a product will emerge which would allow for 8x25G breakouts of a QSFP28-DD into server cabinets.

400G

CFP8 is going to be the first new form factor to arrive for achieving 400G, but it is too large a form factor to fit into the more traditional model of 32 front panel ports in 1RU of space. CFP8 dimensions are 40mm (W) x 102mm (L) x 9.5mm (H), which should max out at around 18 ports per 1RU of space. At 15-18W (3x the power of QSFP28), power consumption is another challenge in designing a line card that can accommodate it. CFP8 is more likely to be used by service providers for router-to-router and router-to-transport longer haul transmissions rather than the traditional ethernet switching devices found in the data center rack.

QSFP56-DD consists of 8 lanes of 50Gbps with PAM4 modulation for 400Gbps operation. It is the same size form factor as QSFP/QSFP28, allowing for up to 36 ports in 1RU of space and flexible product designs where QSFP, QSFP28 or QSFP56-DD modules could alternatively be used in the same port. These 36 ports of 400Gbps would support ASICs with 14.4Tbps in a single RU of space. QSFP56-DD should also support short reach 4x100Gbps breakout into 4x SFP-DD, which is the same size as SFP+/SFP28, making it eventually ideal for server connectivity.

Octal SFP (OSFP) is another new form factor with 8 lanes of 50Gbps for an effective bandwidth of 400G. It is slightly wider than QSFP, but should still be capable of supporting up to 32 ports of 400G, a total of 12.8Tbps in 1RU of front panel space. The challenge for OSFP adoption will be that it is a completely different size form factor than the previous QSFP/QSFP28, which will require a completely new design for 1RU switches and line cards. In other words, there will be no backwards compatibility where a QSFP/QSFP28 could alternatively be plugged into the same port on a line card or fixed switch. An adapter allowing a QSFP28 optic to be inserted into the OSFP form factor is apparently under discussion.
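
Pulling the form factor numbers above together, here is a quick sketch that computes the aggregate front panel bandwidth per 1RU, and therefore the largest switching ASIC each form factor can fully drive:

    FORM_FACTORS = {
        # name: (max ports per 1RU, Gbps per port), per the figures above
        "QSFP28":    (32, 100),
        "uQSFP":     (72, 100),
        "QSFP28-DD": (36, 200),
        "QSFP56-DD": (36, 400),
        "OSFP":      (32, 400),
    }

    for name, (ports, gbps) in FORM_FACTORS.items():
        print(f"{name}: {ports} x {gbps}G = {ports * gbps / 1000:.1f} Tbps per RU")
    # QSFP56-DD: 36 x 400G = 14.4 Tbps -> enough to drive a 14.4T ASIC in 1RU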

So, in conclusion: just as switching ASICs seemed to be quickly outpacing interface bandwidth and front panel real estate, viable options are coming soon that will take us to the 12.8 to 14.4Tbps level in a single RU.

Disclaimer: The views expressed here are my own and do not necessarily reflect the views of my employer Juniper Networks