Zscaler is scheduled to be added to the pile of corporate malware currently infesting my company-issued MacBook.
I’d like to better understand how it works and the implications for some of my workflows.
There’s a classic interview question: “Tell me in as much detail as possible what happens when you type ‘google.com’ into your browser and hit return…” Can we run through that exercise for a Zscaler-enabled client?
What’s really going on with name resolution? Who terminates the browser’s TCP session? The TLS session? What’s the role of the local listener and the CGNAT range?
Some examples of the kinds of workflows I’m worried about:
- I use a lot of SSH port forwarding to reach into “isolated” labs accessible only via SSH jump boxes. Mostly local ports (-L), sometimes remote ports (-R).
- I lean heavily on Wireshark with TLS session key logging for debugging https traffic originated by my machine. How/where will I find the client packets if they’re not on my en0 (wifi) interface?
- Some of the applications I work on don’t rely on the OS-managed list of trusted certificates, don’t rely on DNS (peers are signaled via an in-app mechanism) and don’t send TLS SNI headers.
- How is non-https traffic handled?
If there’s a “how it works” thing I should be reading, feel free to steer me there. Most of the answers I’ve gotten from folks responsible for the deployment have been focused on features (what it can do) and not the nuts-and-bolts of how it does its work.
They tell me I’ll be able to switch Zscaler off (for a few hours, then it re-enables) after it’s deployed, but I’d rather work within the system than have to fight against it. I just have no idea what to expect.
Thanks!
edit: I’ve heard them mention both ZIA and ZPA, so I think I’ll be getting both.
Zscaler is a network security platform, with a number of different products that can each be deployed/used in a number of different ways. With just the info you have, it’s tough to tell which products your company has, or how they intend on using them.
Your question is a bit like asking “how does Microsoft work?”
The main thing to realize is that Zscaler really doesn’t have anything to do with your machine. The agent is lightweight and basically just forwards traffic to the cloud.
The most commonly used products are:
ZIA: think of this as a NGFW in the cloud. There are hundreds of different options and features that could be used, and several different ways your traffic could get there.
ZPA: this often gets compared to VPN, but it’s nothing like VPN. It’s essentially a switchboard that facilitates sessions between you and private apps hosted in a DC, on the Internet, or in a private cloud environment.
I would check with your company, as there’s no way for us to tell you how your company is going to deploy it, or which features/products they are using. None of your use cases are unique; they’re things I have seen hundreds of customers do, so I don’t see any issues with them, but your company would be the best bet for more info.
I loved the “corporate malware” definition for ZScaler lol. On point!
For us in the software development department it broke pretty much ALL secure sites and package managers (for managing software dependencies, like composer, npm, yarn, maven, you name it) for all projects. It also broke the update system in ALL of the linux boxes.
But that’s because the people in charge had no idea, ZPA was poorly implemented, and they didn’t do any prep work at all, TBH. Man, they barely knew how SSL works. No wonder companies go to these kinds of services like ZScaler.
Anyways, a good ZScaler fanboi knows that you should list all “safe” sites and resources first, which is what they ended up doing. For the linux boxes, they added the ZScaler certificate to the trusted list and voila, all traffic was being spied on successfully without any errors 
There’s a lot of “depends” to your questions…
Most of it depends on which ZScaler products your company is implementing: ZIA only, or ZPA as well.
On top of that, which policies will they be applying?
If ZIA only, local traffic won’t be impacted.
Zscaler tunnels traffic to a cloud environment to apply various security policies (FW / DNS / SSL / URL / file type / etc.). Any of these could impact your activities.
I can tell you your wireshark will most likely not work unless they let you manually apply your own proxy settings.
Tools that don’t use the OS cert store will break if they implement SSL inspection. You can add the ZScaler cert to some of them; for the rest, you’ll need to get the destination bypassed from SSL inspection.
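For tools you write yourself, adding the inspection certificate usually means loading one extra CA file into whatever trust store the tool builds. A minimal Python sketch of that idea, not a Zscaler-specific procedure: `make_context` and `extra_ca_file` are hypothetical names, and the file is assumed to be the root CA (in PEM form) that your IT team hands out.

```python
import ssl

def make_context(extra_ca_file=None):
    """Build a verifying TLS client context, optionally trusting one more CA file."""
    # Start from the platform's default trust store.
    context = ssl.create_default_context()
    if extra_ca_file is not None:
        # extra_ca_file is a hypothetical path to the inspection root CA (PEM).
        context.load_verify_locations(cafile=extra_ca_file)
    return context
```

Certificate verification and hostname checking stay enabled; the only change is one more trusted root. An application that pins a specific certificate or key would still fail under inspection and would need the bypass instead.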
They can also decide whether to only forward 80/443 or all ports and protocols.
Plenty more to unpack. In the end they’re not trying to make your life harder but keep your company in business and avoid a major incident. Work with IT and they’ll get you squared away for any business needs you have.
You’ll have a utun interface for ZCC. Your en0 interface would just have the encrypted tunnel to ZScaler (assuming ZT2).
All ports can be full tunneled over ZT2.
You know if you youtube Zscaler, there is a wealth of information.
Yeah, I tuned out as soon as I read that line as well. If I was paid for every single time I had to listen to some IT person complain about how they had to do things differently than they used to in a new network environment with Zscaler and then immediately show them that there is a way easier than it used to be, I wouldn’t be working 70 to 80 hours a week.
Your question is a bit like asking “how does Microsoft work?”
I can be more specific, but this is one of those deals where each layer takes us in a new direction to the next question, so the open-ended “how does it work” seemed to make more sense.
Okay specificity, then. Starting near the beginning:
In a ZIA scenario with TLS decryption enabled, I type curl https://somedomain.com. curl is going to run gethostbyname() (whatever the modern version of this is) to invoke the system’s resolver function, right?
- Let’s imagine the nsswitch logic (I’m sure macOS uses a different term) leads us to DNS: Do we ultimately produce an IP packet containing a DNS query?
- Does that query escape onto the LAN and go to the DNS server suggested by the DHCP lease?
- Will the address of somedomain.com ultimately resolved within the curl application be recognizable to the admins of somedomain.com?
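On the resolver question: the modern replacement for gethostbyname() is getaddrinfo(). Below is a minimal Python sketch that asks the system resolver for a name and lists the addresses an application such as curl would be handed. It assumes nothing about Zscaler itself, and the function name `resolve` is made up for illustration.

```python
import socket

def resolve(host, port=443):
    """Return the unique IP addresses the system resolver gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

# Example (somedomain.com is the placeholder name from the post):
#   print(resolve("somedomain.com"))
```

Comparing the output with the client enabled and disabled is one way to see whether the answer the application receives changes.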
I’m really interested in how this stuff works, and all I’ve gotten from corp IT is outcome-focused what stuff: “there’s a tunnel”, “closest datacenter”, “we have policies”, “we can whitelist your destinations”, etc…
It’s frustrating because I can’t tell (and they don’t seem to know) whether “we can whitelist” has a dependency on making a DNS request or sending an SNI header. Both of which I’m not necessarily going to do, so how can domain-specific policies possibly be applied?
Ultimately, I’ll be able to figure out how it’s impacting my traffic once it’s live on my system, but I’m anxious about how disruptive it’s going to be, and they’re not inspiring confidence.
a good ZScaler fanboi knows that you should list all “safe” sites and resources first
It seems like that community knows the Zscaler admin GUI, but not how it works. I mostly came to the wrong place with my questions 
The answer, I’ve concluded, seems to mostly come down to “routing”, which is obvious, but I’d expected more. The marketing folks go to such lengths to distinguish this product from a VPN that I’d expected to find a socket API wedge or some other magic.
Thank you!
Is traffic diverted into the new tunnel interface based only on destination prefix? Presumably I’d see this with netstat -rn?
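One way to check this once the client is installed is to filter the routing table for entries that point at a utun interface. A minimal Python sketch under stated assumptions: the helper names are hypothetical, the parser only assumes that the interface name appears as a whitespace-separated field after the destination (the column layout of netstat -rn differs between macOS releases), and which routes the client actually installs is not something this sketch knows.

```python
import subprocess

def utun_routes(netstat_output):
    """Return (destination, interface) pairs for routes that use a utun interface."""
    routes = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank lines and section titles
        # Look for any utunN field after the destination column.
        interfaces = [f for f in fields[1:] if f.startswith("utun")]
        if interfaces:
            routes.append((fields[0], interfaces[-1]))
    return routes

def current_utun_routes():
    """Run netstat -rn (macOS/BSD) and filter its output."""
    result = subprocess.run(
        ["netstat", "-rn"], capture_output=True, text=True, check=True
    )
    return utun_routes(result.stdout)
```

Running current_utun_routes() before and after the client connects would show which destination prefixes get diverted.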
Ultimately, the question here was “how does Zscaler get between clients and servers”, and a routing change on the client is an obvious way forward.
I’m having a little difficulty squaring that obvious solution with the Zscaler marketing which goes to such lengths to distinguish itself from “mere” VPNs.
I mean… It sounds like we’re talking about a multipoint cloud-hosted VPN concentrator with a TLS-cracking traffic inspector at the far side.
Is that really all there is to it?
The marketing led me to expect something more magical, perhaps with an LD_PRELOAD to redefine socket API calls and whatnot.
OP’s attitude is completely standard and ubiquitous among software devs in my experience. We’re so often in the position where tools we write or rely on get broken by some change in IT policy that doesn’t get communicated out adequately or with enough technical detail. We’ve become bitter about the entire process and the way security teams in particular operate. Virtually all of us use terms like “corporate malware” because our experience is just so bad. You should see the level of anger that occurs in our teams chat when secops fuck us over yet again and everyone knows they’ll now have to do a lot of work to get their workflow back to what they wanted.
All that said, in the specific case of zscaler, I’ve had relatively few problems. We’re just wary in general. (I say relatively few, I have had to buy a new router for my house with my own money because zscaler + teams + my old router caused an issue that my IT team and zscaler support couldn’t/wouldn’t help with).
Yep. You’ll see a bunch of routes in the routing table pointing to the utun interface.
I should note ZPA operates a little differently in the sense it has its own separate tunnel to brokers - which connect to your connectors (AWS for example) and then the destination app.
ZCC inspects dns in order to intercept and answer ZPA apps with 100.64 and that’s how it goes to ZPA instead of ZIA.
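A quick way to tell whether a resolved address is one of these synthetic answers is to test it against the CGNAT block. A minimal Python sketch: `is_cgnat` is a hypothetical helper name, and treating the whole RFC 6598 block 100.64.0.0/10 as the range is an assumption, since the post only says “100.64”.

```python
import ipaddress

# 100.64.0.0/10 is the shared address space reserved by RFC 6598 (the "CGNAT range").
CGNAT = ipaddress.ip_network("100.64.0.0/10")

def is_cgnat(address):
    """True if address falls inside 100.64.0.0/10."""
    return ipaddress.ip_address(address) in CGNAT
```

If a name resolves into this range while the client is running and to a different address when it is off, that name is presumably being handled as a ZPA app.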
ZCC inspects dns in order to intercept and answer ZPA apps with 100.64
Ah. This is more in line with the “magic” I was expecting.
Are the listeners in the CGNAT (100.64) range local to the endpoint, or do they live on the other side of the tunnel?
If the client sends something not-inspectable to a DNS name which runs other inspectable services (we SSH to a webserver), and the policy says to permit it, will that traffic still be tunneled but not inspected? Perhaps this behavior is configurable?
AFAIK the 100.64 is just local to the specific endpoint. Then it gets sent off to Zscaler.
In the case of ZPA, a destination app will see the IP of the app connector - so it is effectively a proxy.
I think the term the experts like to use is “micro tunnels”.
Basically your client connects to Z, the app connector connects to Z, which bridges the two together - and then off to the destination app.
SSH would just source as the ZIA data center for the user.
You can choose to block certain ports (firewall) if you so choose.
Edit: regarding ssh - it’ll go over ZIA (like anything else) unless it’s sent over ZPA.
Also - ZPA by default isn’t ssl inspected. It can be, but it involves sending it through ZIA - I believe the term is called SIPA (source ip anchoring).
In my time using Z, we’ve had most of our issues with ZIA data centers having slowness, so I’d rather not put ZPA through that too, but rather keep the two separate.
AFAIK the 100.64 is just local to the specific endpoint. Then it gets sent off to Zscaler
Cool. This was my expectation.
So, a very fast (local host) TCP handshake with an on-the-fly-generated TCP listener which then does at least a TCP proxy, and possibly dives further in.
Guaranteed to break applications which use TCP-AO (RFC 5925) or TCP MD5 (RFC 2385).
You mentioned “ZT2” earlier. That was a super helpful breadcrumb. Thanks for that too.