I'm going to discuss the web security model, and the goal is to tell you a bit about the client-side security policies on the web.
So what is the cornerstone security policy? How does it work? How can you leverage it to build secure applications? I will work with an example: I have an aptly named example.com application, which has several parts that are important or need to be taken into account.
First, you have a public part where you have some public information; every website has that. You have an account management part, where you need to log in securely, things like that. You have a private customer area, again a sensitive part where only your customers are supposed to be and where you definitely don't want anything to go wrong. And you have a public forum, which has different properties because it's publicly accessible and has public content. You still don't want common web attacks to happen there, but it's definitely less sensitive than the other parts of the application, so maybe you want some kind of separation in between. Then you have some analytics code, which is present on almost every website today.
We also want to integrate some location information and some Twitter integration, where you have a Twitter gadget. All of this doesn't sound too complicated. So how would we do it? Let's say we have a browser on the left and the back end on the right. We have some components that are deployed in the back end: the private customer area and the public forum, and they use a shared back end. Sounds reasonable.
You deploy it and it's loaded by a web browser, so it loads some page content from the private area. It loads the external components that are integrated there, and you expect that there's some kind of isolation available. This is loaded in a browser, so not much can go wrong. These components are loaded, but you expect: it's a gadget I integrate, so it's probably isolated somehow. Not too much can go wrong.
Similarly, you have the public forum; it's loaded in a browser. It should be separate from the rest, and even if there is another tab or another window open with some malicious website, you hope it cannot influence your website. But unfortunately, if that was the case, I wouldn't be here and everything would be solved.
So I'm going to discuss with you step by step what does happen in a browser and why this is a problem, and then in a second part we'll look at solutions to address this problem. So, first of all, if you host these on your traditional example.com domain, if you don't take any separation into account, then these things are not isolated; they're definitely joined together.
So if you have a problem in the forum, your private customer area will suffer from it. Definitely not what you want. These policies depend on the origin, so everything hosted under this origin will be treated the same and will have access to everything on your website. So that's definitely something you want to avoid.
Second, the external code: if you include it the way it's often included today, then there is no isolation. You include a script from Twitter and you have this nice Twitter box. Everything seems right, but in practice there is absolutely no isolation, so the Twitter content has full access to the page you load it in.
And even worse, it has access to your locally stored data. It has access to the permissions that are associated with your domain. It has access to APIs that can make requests to your back end, without you being able to distinguish these requests from legitimate requests. So again, no boundaries, which is problematic.
A third problem is that these scripts that you include typically come from a remote host. So you have some code hosted by Twitter, you have some code hosted by Google, and you include it. That's not per se a problem, but by doing so you kind of trust Google or Twitter to do the right thing. Well, they probably do the right thing, but you also trust them to not be vulnerable to attacks, because if an attacker can inject code there, it's loaded in your website and you become vulnerable. To illustrate this, I have a few examples here. You have the qTip2 library; it's a jQuery plugin that you can download from the website.
Apparently, at the end of 2011, the code was compromised on the repository itself, so everybody who downloaded it in the period of 32 days between these two dates downloaded the library including some malware. So you include that in your website and everything works fine, but in the background you actually loaded malware, which started connecting to servers on the internet. This was discovered through bug reports of users saying, like, okay, I've included this, but now I'm making requests to some weird IP.
Imagine that you didn't have to download this, but included it directly on your website. Then one compromise of one code library, and you're instantly compromised as well. A second example: colleagues of mine did a large-scale study of the top ten thousand Alexa domains, basically the ten thousand most popular sites on the internet, and they looked at how many scripts are included. How large is this problem? Is this a problem or not? Well, you can derive it from the logos here: this is probably a problem.
88.45 percent of these sites actually include remote scripts. So 88.45 percent of the sites trust a remote host to not be compromised when they include scripts. Here's a histogram plotting the use of remote script hosts. Here is the percentage of sites, and here the number of remote hosts that they trust to include scripts from. So you see at the beginning that about seventy percent of the sites use between 0 and 20 different remote hosts to include scripts from, which is already quite a lot of dependencies on external service providers to do the right thing, to provide correct code, to not be compromised. And one site even managed to include scripts from 295 different remote hosts, so I definitely do not recommend using that site if you are a bit security conscious.
Additionally, they defined some measurements to see whether a site is security conscious or not: whether they apply certain security technologies, whether they configure and deploy their technology securely. Based on those numbers, they concluded that twelve percent of the sites that do the right thing, that take extra measures to make things secure, do include scripts from hosts that do not take these measures. So these twelve percent of sites are basically very vulnerable to compromise through an external script provider. So this is definitely a problem in the web today, and it's definitely not a solved problem.
The next step is, we probably want some secure connection for our own website. Seems reasonable; after all, you have authentication information, you have login credentials, you have session cookies. So we deploy HTTPS, standard practice. Doing so also means that if you include remote code, you need to do it over HTTPS as well; otherwise you get mixed content, and mixed content leaves you vulnerable to network attackers again.
So even if you do the right thing, you take all these measures, you deploy HTTPS, you protect against these attacks: include one library insecurely, with an attacker in the wrong place, and it's again effort for nothing; you're compromised.
So a final problem in the web today is that, even though there is some isolation at the client side enforced by the same-origin policy, for instance, where the attacker cannot directly access your website, nothing prevents the attacker from making requests to your back end. This is how the web works.
This is how you include an image from some other site in your website: that's such a request. But since you have cookies here, the browser happily attaches these cookies to the other requests as well, and then you have an attack called cross-site request forgery. You've probably heard of it; Jim mentioned it yesterday in his talk. CSRF, or cross-site request forgery in full, happens everywhere. Facebook has it. Google has it; of course, they fix it fairly quickly, but other sites have it as well.
You have change-email forms, where you can change the mail address for your account, that are vulnerable to CSRF, so an attacker can change your mail address, then reset your password and take control of your account. Your home router is probably vulnerable to CSRF. These devices are notoriously insecure, because, yeah, who is going to attack you from inside your own network? But you visit a website, and the website tries to change the DNS servers on your home device. If it's vulnerable to CSRF, it succeeds, and your DNS servers are changed to the attacker's ones, and he has control over all the websites you visit. So again, this is an important problem.
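One widely used protection, which is not necessarily the one the speaker has in mind at this point, is a per-session anti-CSRF token: the cookie is attached by the browser automatically, but a forging site cannot know the token that the legitimate page embeds in its own forms. A minimal sketch in Python, where `SERVER_KEY`, `issue_csrf_token`, and `is_legitimate` are hypothetical names chosen for illustration:

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical server-side secret; a real deployment would load it from configuration.
SERVER_KEY = secrets.token_bytes(32)

def issue_csrf_token(session_id: str) -> str:
    """Derive a token bound to the session; the server embeds it in its own forms."""
    return hmac.new(SERVER_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def is_legitimate(session_id: str, submitted_token: Optional[str]) -> bool:
    """The session cookie alone is not enough: the request must also carry the token."""
    if submitted_token is None:
        return False
    expected = issue_csrf_token(session_id)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, submitted_token)

session_id = secrets.token_urlsafe(32)
token = issue_csrf_token(session_id)
```

A request forged from another site still carries the session cookie, but it arrives without the token (or with a guess), so the back end can tell it apart from a request made by its own pages.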
There is protection against it, but you have to know about it to build it in, and once you build it in, the problem is fixed. But today on the web it isn't fixed, so that's what you're here to learn today. So what do we want? This is the application that we do want. We want our client-side components to be isolated from each other, especially if they have a different sensitivity. For instance, a forum is less sensitive than the rest, so okay, isolate it; don't let them influence each other. We want secure connections to be used, and we want them to be used for the other content as well.
We want the external code integrated into our pages to be isolated or at least restricted. We want to be able to say: you can do this and this, but nothing else. We don't want it to sniff around in our local storage facilities or whatever. And when potentially malicious contexts make requests, we don't want them to influence the back end. Well, you can't prevent them from making requests, but at least you can detect them and say, okay, this is definitely not something we want, so we're going to prevent this or ignore this.
We're going to tackle these problems step by step. The first challenge we're going to tackle is compartmentalization. We are going to see how we can leverage the same-origin policy, the cornerstone security policy of the web, which has been there since '95, I think, to effectively isolate components. We can separate them within the browser. Of course, if you separate them, you need to share some stuff. You want to share authentication information: we don't want the user to have to log in again on every part of our website, so we want some way to share the authentication across the website and also share some information.
If you want to request some account information from another part of the site, we want to be able to do that. We want to use third-party code, we want to do analytics and Twitter and everything associated, but we don't want mixed content warnings, so we'll have to address that in some way. We also want to be able to manage the risks associated with that code. We want to either trust a trustworthy provider, for instance if you include something from Google and you're aware that this is a risk, but it's a risk you're willing to take, then that's fine. Or, if you don't want to trust the provider, for instance when you include code from some shady website somewhere on the web, there are solutions you can apply there. And finally, we're going to see some communication mechanisms for the back end.
The same-origin policy is based on these origins. Maybe you know the same-origin policy as the thing in the browser that gives you an error when you do something that violates it. That's possible; you see it as DOM exceptions or explicit same-origin policy exceptions. But what it basically says is that if you have content that comes from one origin, it can interact with other content from the same origin. So if you include, for instance, an iframe from the same origin, you can interact with it, you can access it, you can modify it, you can do whatever you want. But if it comes from another origin, then it's restricted.
This is important because otherwise you could include an iframe, access it, inspect it, extract a password from it, and things like that. So that's definitely not something we want.
There may be some quirks here and there, but in most modern browsers these are resolved, and everybody enforces the same policy. It allows you to separate sensitive parts from non-sensitive parts if you put them in different origins: the policy uses the origins to effectively separate them.
It can prevent the unintended sharing of information: if you have separate origins, information will not be shared unless you want it to be. It also prevents escalation of an attack: if your forum is compromised, since the other parts of the site live in a different origin, it won't be that easy to compromise the other parts as well.
So, a small example. Here we have the forum part, which includes an iframe from the private customer part, and in this case they both have the same origin, so they can freely interact. That's potentially a problem.
Additionally, let's say that the private part uses a data store in the browser, for instance IndexedDB. If you know it, it's a sort of database, a storage system you can use within the browser, so you can store some data there. You also have Web Storage, which is simple key-value storage in the browser. If the private part uses this, it's stored under the origin of example.com, which basically means that the forum part can also use it.
This may not be what you want: if the forum doesn't need the data that is stored there, it doesn't need access to it. So what would change if we do it like this? Here you have forum.example.com and private.example.com: different domains, so different origins. They cannot interact anymore, at least not freely; we will see later how they can interact. And if the private part stores some data now, it's stored under its own origin, and there's no way for the forum to actively access this data store. So again you get separation from the same-origin policy. You have two choices to do this: you can either use subdomains or completely separate domains.
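The comparison the browser makes here can be sketched in a few lines. An origin is the combination of scheme, host, and port, and two URLs are same-origin only when all three match. This is a simplified Python model (the helper names `origin` and `same_origin` are made up for this sketch, and it ignores special cases such as `file:` URLs):

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that the same-origin policy compares."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a: str, b: str) -> bool:
    """Two URLs are same-origin only if scheme, host and port are all equal."""
    return origin(a) == origin(b)

# Forum and private area as directories of one domain: one shared origin.
shared = same_origin("https://example.com/forum/", "https://example.com/private/")

# Forum and private area on separate subdomains: two different origins.
split = same_origin("https://forum.example.com/", "https://private.example.com/")
```

In the first deployment `shared` is true, so the forum can reach into the private area and its storage; in the second `split` is false, which is exactly the separation described above.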
I'm going to go over both options. Subdomains basically have the same parent domain, which is example.com here. This has some advantages: it's a different origin, so that's a good thing, but some relaxation mechanisms exist to make them same-origin again. They can both relax their domain to the parent domain example.com using the document.domain property, and from then on they're considered to be same-origin again and they can interact. With subdomains you can also share cookies on example.com, so you can set one cookie that applies to all subdomains.
For separate domains, let's say we have privateexample.com and exampleforum.com, two completely different domains. These cannot interact, they cannot relax, and they cannot share cookies. So that's a trade-off: which one do you want to use?
One example of subdomains here: let's say we have www.example.com, the private one, the forum one, and the accounts one. In this case, nobody can interact with anybody: all different domains, so different origins, and nothing happens there.
But if the forum now decides, oh, I want to relax my domain to example.com, well, it can also interact with any other contexts that have relaxed to example.com, and that's maybe not what you intended. So that's definitely something to keep in mind when deciding how to compartmentalize your application. So, let's see, how can we compartmentalize?
We have the four components here, and we're going to have some questions that are going to help us decide what to put where. First of all, does it have sensitive content? Well, the public part doesn't have sensitive content. The forum doesn't either; I mean, it's a public forum, so no real sensitive content there. But these parts do have sensitive content, so we probably want to separate them.
Do you have authentication, which means credentials, session management, those kinds of things? Well, the public part doesn't, but the rest does. This implies that there will be some sensitive data transmitted back and forth, so you probably want to deploy those over HTTPS. For the public part, if you have the option, of course you deploy over HTTPS too, because you gain a lot of security, but for the sake of this example I'm not going to do it, so I can talk about some interesting interactions between HTTP and HTTPS.
Do you need cooperation between domains? Well, these two don't, but these two want to cooperate. For instance, if you have a private customer area, you want to retrieve some account details to display in a status overview, or I don't know what; it's an example application, so fill it in as you want, but they want to communicate or interact anyway. So to deploy, we're going to put this on the overall domain in HTTP; all the rest will be HTTPS and will have its own subdomain, except for the forum, which will become a completely different domain.
So this will, for instance, allow us to use domain relaxation or domain-based cookies here without the forum having access to them. Okay, deployment-wise, what does it look like? Well, we have the back end and our four components. I don't really care about this part; this is more architectural stuff, which will be talked about in other sessions, so don't worry about it. We're worried about this part: it's deployed in a browser, and because these are all different origins, they're kind of isolated within the browser. So that's a good thing; you can leverage the same-origin policy to achieve this.
These parts need authentication, so we also want to deploy them over HTTPS, which will have some consequences later on. And finally, we want some interaction from the private customer area to the account management part; how we will achieve that will be dealt with later on. All right, by the way, I've finished the first part now, so if there are any questions, you can raise your hand and ask them.
I'll do my best to answer them. So now we have different compartments. The next step is that we want to have authentication on these parts without having to redo it every time, and we also want to exchange some information between these two contexts. So first, authentication. Authentication on the web consists of two steps. First, you need to know who the user is; that's the entity authentication part. Next, once that is done, you want to maintain a session. As you all know, HTTP doesn't really know which requests belong together.
So we use something like session identifiers to know, okay, these and these and these requests belong together, and they're associated with a dedicated authentication state, so we can deal with the requests appropriately. Entity authentication I'm not really going into in detail, because it has little influence on the security policies within the browser.
So, basically, today you exchange usernames and passwords; this is probably not the best practice, but it's the most commonly used. You also have challenge-response systems, like with the Belgian banks and their tokens, which is one good example. And you have client-side certificates, which are on your eID: if you use that to log in on a website, you use a client-side TLS certificate. And then you have session management.
De facto, session management is cookie-based. So what happens? You have a session identifier that you put in a cookie; it's transmitted on every request, and you associate the request with the session. This is quite important, because this cookie is actually a bearer token, which means that if you present the cookie to the web server, it's going to assume that you're the legitimate holder of the session. So if somebody manages to steal that cookie, he can impersonate you, because the web server doesn't know that it's not you. So, very quickly.
How does this work? You have a browser and a server, and the browser sends the first request for some page on the server. The server sees this coming in and wants to establish a session, but there's no session cookie or anything, so it doesn't know who you are. So in the session store, it will create a new session with a very random identifier.
How you should actually generate identifiers is on the OWASP website; there's a cheat sheet with a whole list of recommendations. Here it's a non-random example, and it has some authentication state, which is false, so no user is authenticated. The response contains a Set-Cookie header, which contains this identifier, and in the browser this will result in a cookie being stored for this domain.
So it's the domain that belongs to the request, and this is stored in the browser, so on every subsequent request this cookie will be sent if the request goes to this domain. So again, a request to this domain: oh yeah, I have a cookie for that, let me attach it, which allows the server to look up the session. It doesn't really care about this page, so it just sends a response, which triggers the actual login request with a username and a password, for instance. Again, the cookie is attached, allowing the server to retrieve this session. It checks the username and password; they are valid for the user Bob, so it switches the authentication state to true and sets the user to Bob. So every subsequent request will allow the server to look up this session and recognize the user.
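The request flow just described can be condensed into a small server-side sketch. It is a toy model under stated assumptions: the cookie name `SID`, the in-memory `sessions` dictionary, and the plain-text credential table are all hypothetical, and a real application would store password hashes and follow the OWASP recommendations for identifiers.

```python
import secrets

sessions = {}  # session store: identifier -> session state

def handle_request(cookie_sid=None):
    """Return (session_id, set_cookie_header_or_None) for an incoming request."""
    if cookie_sid in sessions:
        return cookie_sid, None           # known session, no new cookie needed
    sid = secrets.token_urlsafe(32)       # unpredictable identifier
    sessions[sid] = {"authenticated": False, "user": None}
    return sid, f"Set-Cookie: SID={sid}"

def login(sid, username, password, credentials):
    """Switch the session's authentication state if the credentials check out."""
    # Toy check only: real code compares against stored password hashes.
    if username in credentials and credentials[username] == password:
        sessions[sid].update(authenticated=True, user=username)
    return sessions[sid]["authenticated"]

# First request: no cookie yet, so a fresh unauthenticated session is created.
sid, header = handle_request()
# Later request: the browser attaches the cookie, the server finds the session.
same_sid, no_header = handle_request(sid)
# Login request: the session switches to authenticated for user "bob".
logged_in = login(sid, "bob", "hunter2", {"bob": "hunter2"})
```

Because the identifier is the only thing tying requests to the session, anyone who presents it is treated as the session's owner, which is the bearer-token property mentioned earlier.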
The server sees: okay, this is Bob who authenticated, so I'm going to do action X or Y or whatever. That's how cookie-based session management works. What is important with cookies is: what can you do to modify their behavior? There are a lot of flags, well, four or five actually, that have an impact on cookie behavior. The first one is the Domain attribute, which allows you to set a cookie for a parent domain. In our example, for instance, if you want the cookie to be used in all subdomains of example.com, you can set the Domain attribute of the cookie to example.com, and the browser will send the cookie not only to www.example.com but also to foo or bar or accounts or private or whatever. Not to exampleforum.com; that's a different domain, so it doesn't apply there.
A second attribute is the Path, which we don't need in our application, but I'm going to cover it because it's important to realize something about the path. It allows you to limit a cookie to a specific directory within a domain.
So, for instance, let's say you have user directories under a shared domain. For instance, Telenet has these websites for its customers, under a users domain with one directory per customer. If they have an application there, they want to set a path on their cookies, because they only want the cookies to be sent to their application. Okay, seems fine. The problem, however, is that it conflicts with the same-origin policy. Why? Let's say I am the attacker, who lives under the users domain, and I want to steal the cookies from an application that's under the victim directory: same domain, but the victim's path.
If you send a request to the attacker's path, the cookies for the victim will not be included. Why should they be? They're restricted to the path. So the attacker includes an iframe and loads a page from the victim application. Same domain, so they are same-origin. So he can interact with this frame: within the frame he can access the document, and from the document he can access the cookie property. Now, this is within the victim application, so the browser will return all the cookies belonging to this domain whose path matches the victim. So in this case, the attacker can successfully steal the cookies from the victim application, within the limits of the same-origin policy.
And that's because there is a mismatch between the same-origin policy, which is based on origins, and the cookie Path attribute, which is based on the path of a URL. So, shortly after they introduced this attribute, they added a line saying that this is not a security property: you can use it for practical purposes or to avoid conflicts in cookie names, but not as a security measure. Good.
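The two scoping rules for cookies can be written down as small matching functions. The sketch below is a simplified Python model of the matching rules for the Domain and Path attributes; the function names and the `/victim` and `/attacker` directories are illustrative, and details such as public suffixes are left out.

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """A cookie with Domain=cookie_domain goes to that host and to its subdomains."""
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)

def path_matches(request_path: str, cookie_path: str) -> bool:
    """Simplified path-match: the cookie path must be a prefix ending at a '/' boundary."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False

# Domain attribute: shared across subdomains, not across separate domains.
to_accounts = domain_matches("accounts.example.com", "example.com")
to_forum = domain_matches("exampleforum.com", "example.com")

# Path attribute: a cookie scoped to /victim is not sent with requests to /attacker,
# yet both directories live on one host and therefore share one origin.
sent_to_attacker = path_matches("/attacker/page", "/victim")
sent_to_victim = path_matches("/victim/app", "/victim")
```

The path rule only decides which requests carry the cookie. It says nothing about which pages may script each other, and that is why the iframe trick above works even though `sent_to_attacker` is false.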
So in this example, you have the attacker, who lives on attacker.com. He's a network attacker, so he sits on the same wireless network, so he can sniff your plain-text HTTP connections. But don't worry, you have HTTPS, so everything is fine. But what he does is trick your browser into sending a request to secure.example.com, which is supposed to be HTTPS, but he uses the HTTP scheme. Seems quite harmless, but if the browser has cookies for this domain without the Secure flag, it will attach them to this request, because they're not marked Secure, so why shouldn't it?
And this allows an attacker to steal your cookies from the network, which effectively leaks cookies that should be secure to the network. Therefore, if you have any cookies that are used on HTTPS connections, add the Secure flag. Again, a simple solution for a lot of troubles that can arise. Additionally, if you have a mixed HTTP and HTTPS environment, you might figure: okay, I can't use the Secure flag because I need the cookies on HTTP as well. But that's definitely not the right approach, because if you have the HTTP connection, the cookies will leak eventually, and that will also compromise the HTTPS part.
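The browser's decision in this attack can be modelled in a few lines. This is a sketch under stated assumptions: the `Cookie` class, the cookie names `SID` and `SSID`, and their values are invented for illustration, and domain and path matching are left out to keep the focus on the Secure flag.

```python
from dataclasses import dataclass

@dataclass
class Cookie:
    name: str
    value: str
    secure: bool = False

def cookies_to_attach(jar, scheme):
    """Model of the browser's choice: Secure cookies only travel over https."""
    return [c.name for c in jar if scheme == "https" or not c.secure]

jar = [Cookie("SID", "abc123"), Cookie("SSID", "def456", secure=True)]

# The attacker forces a plain http request to the site: only the
# cookie without the Secure flag is exposed on the network.
leaked = cookies_to_attach(jar, "http")
# A normal https request carries both cookies.
normal = cookies_to_attach(jar, "https")
```

With this jar, the forced HTTP request exposes `SID` but never `SSID`, which is the whole point of the flag.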
So the good approach here is to associate different security levels with different cookies. So you have a separate HTTP and a separate HTTPS session identifier, which can eventually point to the same session. But for every security-sensitive operation you require HTTPS, and if the HTTPS cookie is not present, which is the case with an attacker who can only steal the HTTP cookies, then you will not execute the sensitive operation. That's essentially how a lot of websites nowadays work.
For instance, Amazon: the shopping cart is HTTP and it's vulnerable. Somebody can hijack your connection and add stuff to your cart. But once you check out, once you change account details, they switch to HTTPS, and if you don't have the appropriate cookie, they just ask you to log in and re-authenticate, so for the attacker it's game over by then. Good.
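The two-level scheme can be sketched as a server-side authorization check. This is a hypothetical model, not how any particular site implements it: the cookie names `SID` (sent over both schemes) and `SSID` (marked Secure), the session layout, and the `authorize` function are assumptions made for the example.

```python
import hmac

def cookie_ok(cookies: dict, name: str, expected: str) -> bool:
    """Compare a presented cookie with the stored identifier in constant time."""
    return hmac.compare_digest(cookies.get(name, ""), expected)

def authorize(session: dict, scheme: str, cookies: dict, sensitive: bool) -> bool:
    """Both identifiers point at one session; sensitive operations need the HTTPS one."""
    if not cookie_ok(cookies, "SID", session["http_sid"]):
        return False
    if not sensitive:
        return True
    return scheme == "https" and cookie_ok(cookies, "SSID", session["https_sid"])

session = {"http_sid": "aaa", "https_sid": "bbb"}
stolen = {"SID": "aaa"}               # all a network attacker can sniff
full = {"SID": "aaa", "SSID": "bbb"}  # what the legitimate browser sends over https

can_edit_cart = authorize(session, "http", stolen, sensitive=False)
can_check_out = authorize(session, "https", stolen, sensitive=True)
owner_checks_out = authorize(session, "https", full, sensitive=True)
```

The attacker with the sniffed HTTP cookie can still tamper with the low-value part (`can_edit_cart` is true), but the sensitive operation is refused because the Secure cookie never crossed the network in the clear.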
What does this mean for example.com? We have the need for cookies basically here, where we want session management, so everywhere there is authentication. These two domains have the same parent domain, and we choose to deploy the cookie on the example.com domain, so it will be sent to every example.com subdomain.
But we add the Secure flag and the HttpOnly flag. The Secure flag will limit this to HTTPS connections, which means that the public part, where we don't really need a session cookie and which is not deployed over HTTPS, will not see this cookie, because it's marked Secure. Alternatively, if we want to share authentication with the forum, that might be a bit difficult, because we can't set a cookie that applies to both. So the correct approach there is to have a separate cookie for the forum website, but you can associate the same session with this cookie. That's how Google, for instance, does single sign-on.
If you log into Google, once you're authenticated, it starts sending background requests to its partner websites like YouTube and whatever, and it basically says to YouTube: okay, this user is authenticated, here's the session you can use. So when the user visits you and he has this session ID, we actually know that it's associated with this internal session, and that's the user Philip, for instance. And that's actually quite a good approach.
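For the cookie shared by the example.com subdomains, the header the back end emits could be built with Python's standard `http.cookies` module. This is a minimal sketch: the cookie name `SID`, the `Path=/` choice, and the helper name are assumptions for illustration.

```python
from http.cookies import SimpleCookie

def session_cookie_header(sid: str) -> str:
    """Build the value of a Set-Cookie header for the shared session cookie."""
    cookie = SimpleCookie()
    cookie["SID"] = sid
    cookie["SID"]["domain"] = "example.com"  # sent to every example.com subdomain
    cookie["SID"]["path"] = "/"
    cookie["SID"]["secure"] = True           # only over https, so the public http part never sees it
    cookie["SID"]["httponly"] = True         # not readable through document.cookie
    return cookie["SID"].OutputString()

header_value = session_cookie_header("abc123")
```

The forum on its separate domain cannot be covered by this cookie, so it would get its own cookie from its own domain, associated with the same session on the server side.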
So you can share authentication without sharing the session identifiers, basically. So that's the authentication part. Now we want to see how we can achieve other interactions besides authentication, and the interaction is kind of limited because of the same-origin policy. So what is allowed? You have related contexts: for instance, if you include an iframe, you can get a reference to the iframe.
You can ask some things of the iframe. If it's same-origin, you can access it completely; if it's cross-origin, you are very limited. Another example of a related context is when you open a pop-up: you get a reference to the window, and you can, for instance, ask the window to navigate to a URL or something like that. So one interaction that is possible is navigation. You can ask a child frame or the top frame, for instance, or a window you've got a reference to, to navigate, so you can send it to a certain URL.
This is useful if you want to have some automated navigation somewhere, or if you want to reload the page in the pop-up. A second interaction that is possible is using exposed APIs, and this is useful for us, because one of the APIs that is exposed is Web Messaging, which allows interaction between frames. The Web Messaging API is basically a mechanism that allows you to send a message to a window or a different context, which can choose to receive the messages; it's an opt-in mechanism. So if it doesn't care about messages, it just doesn't accept them, and once it processes a message, it also has a reference to the sender, so it can send a message back.
First, if you send a message, you tell the API: I'm going to send a message to private.example.com, for instance. So you specify the origin. This is important: if you have an iframe and an attacker manages to navigate this iframe to his own website, and then you send a message, you have leaked information to the attacker, which is not what you want. So you specify the receiver, and the browser checks this. If it doesn't match what you specified, it won't deliver the message.
So you are certain that if the message is delivered, it's delivered to the right party. When you receive a message, you want to check where it came from, because basically everybody can include your application and start sending messages to it. And if you accept everything and just, for instance, execute requests on everybody's behalf, then things will go wrong. So you get an origin property that you can check. When you receive a message, you can check: oh, this came from accounts.example.com.
Yes, I want to accept this. Or, if it came from evilbastard.com: no, no, I'm not going to accept that. Second, if you accept it and you want to do something with it, it's best to check what you actually got. For instance, if you have an API that accepts a message and creates a request based on that message, then you definitely want to be sure that you're not vulnerable to any injection attacks. This is a new API, and people are looking into this kind of problem, and apparently a lot of sites have deployed this.
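In a real page these checks are written in JavaScript around `window.postMessage(message, targetOrigin)` and the `origin` property of the `message` event. The logic itself is small enough to model in Python; the allow-lists, the action name, and both function names below are hypothetical.

```python
ALLOWED_SENDERS = {"https://accounts.example.com"}
ALLOWED_ACTIONS = {"get_account_summary"}

def browser_delivers(target_origin: str, actual_origin: str) -> bool:
    """Sender side: the browser drops the message unless the receiving
    window's current origin matches the one the sender specified."""
    return target_origin == "*" or target_origin == actual_origin

def handle_message(event_origin: str, data: dict):
    """Receiver side: check where the message came from, then what it contains."""
    if event_origin not in ALLOWED_SENDERS:
        return None   # ignore messages from unknown origins
    action = data.get("action")
    if action not in ALLOWED_ACTIONS:
        return None   # never build requests from unvalidated input
    return {"action": action, "status": "accepted"}

# The iframe was navigated to the attacker's site: nothing is delivered.
leak = browser_delivers("https://private.example.com", "https://attacker.example")
# A message from the accounts part with a known action is accepted.
reply = handle_message("https://accounts.example.com", {"action": "get_account_summary"})
```

Using the wildcard `"*"` as target origin switches the sender-side check off, which is why specifying the exact receiver matters.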
Let's say you want to store some data in the user's browser, but you want to remain in control over who gets access to the data, regardless of which subdomain wants to access it. What you do is define a storage subdomain, which is basically a component that only does storage: there's no interaction, there are no other pages, nothing.
It has this data store, stored under its own origin in the browser, and you can choose the store you want. Let's say it's IndexedDB, so it offers some way to store data. Now every component of the application that wants to store some data and wants to be able to access it later on just includes an iframe of this component and uses web messaging to ask, like: okay, I want to read some account information for this customer.
Then the storage API can decide whether it wants to allow this. If you are the accounts application, it can say: okay, yes, you can read this, and here is the data. If it's the public forum, for instance, it can say: oh no, definitely not, you have no business with the account information, I'm not allowing this. So basically it allows for content inspection and access control, which you would otherwise not get. It's just a nice application.
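The access-control decision made by such a storage component can be sketched as follows. This is a Python model of the decision only: in the browser it would sit inside the `message` event handler of the storage iframe, and the access-control list, the stored record, and the function name are all made up for the example.

```python
# Hypothetical access-control list: which sender origin may read which keys.
ACL = {
    "https://accounts.example.com": {"account"},
}

# Stand-in for the data the storage component keeps under its own origin.
STORE = {"account": {"customer": "Bob", "plan": "basic"}}

def storage_request(sender_origin: str, key: str) -> dict:
    """Answer a read request received through web messaging."""
    allowed_keys = ACL.get(sender_origin, set())
    if key not in allowed_keys:
        return {"ok": False, "error": "access denied"}
    return {"ok": True, "value": STORE.get(key)}

from_accounts = storage_request("https://accounts.example.com", "account")
from_forum = storage_request("https://exampleforum.com", "account")
```

Because the data lives under the storage component's own origin, the only way in is through this function, so every read can be inspected and refused.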
So practically, how would we deploy this? Well, we need this interaction. Basically, we can include an iframe from the accounts part in the private customer area, start sending messages to it using postMessage, and get information out of it. Alternatively, we could have used document.domain to relax both origins and interact with them directly, but as I told you before, that would open up this origin to other contexts as well. So that’s maybe not what you want, and that’s why we chose web messaging. All right, now we have covered the mechanics of our own application.
The images are simply put on the page where there is a placeholder, they are displayed, and that’s about it. Images are not active, so they can’t do anything wrong, and the only thing that can leak to the including page is basically the size of the image. That’s about it. So what can go wrong here, or what are the problems with including content? Well, we have the mixed content problem, which is when we have an HTTPS site and we want to include HTTP resources.
What exactly can go wrong? Let’s say we have a browser with an HTTPS page loaded, and the page includes some scripts and style sheets and whatever. What happens when the browser processes this page? It sees: okay, I need a script, which is an HTTP script, so I’m going to request the script, and the server is going to answer with the file. That’s very straightforward and happens thousands of times when you’re browsing. But if you have an attacker that sits on the network, he sees this request going out and he’s like: okay, this is an HTTP resource for an HTTPS site. Interesting. I want to get my hands on that site.
So let me quickly respond with a compromised script before the server does. The browser expects a script coming in, so it accepts the script as a valid response. Now you have the attacker’s script running in your HTTPS page, because the browser parses it and loads it. This is again very problematic, and that’s exactly why mixed content is such a problem and why it’s also being handled in the modern web. So how do you solve this? First of all, browsers are blocking mixed content inclusion. Surprisingly, Internet Explorer started with this: they started giving pop-up warnings when you have an HTTPS site including HTTP content, and today other browsers are following as well.
So this is something that’s being addressed, and it also punishes developers for not having fixed their site. You definitely don’t want your user to see this warning on every page he loads, so that’s definitely something you want to address. What can you do? Well, first of all, include HTTPS resources, but that’s easier said than done. Alternatively, you can localize these resources: if you have this HTTP script that you really want to include, one solution is to just download it and put it on your own web server.
Yeah, it’s not the most elegant solution, but it solves the problem, because the script is loaded from your web server, and you use HTTPS anyway. The only disadvantage is that you need to keep it up to date: if the remote script changes, you need to update your version as well.
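To find the resources that would trigger the mixed content warning, a small check along these lines could be run over the URLs a page includes. This is only a sketch under simple assumptions: it looks at the URL scheme as a plain string, and the function names and example URLs are made up.

```typescript
// Returns the lower-cased scheme of an absolute URL, or null for relative and
// protocol-relative URLs (those inherit the scheme of the including page).
function schemeOf(url: string): string | null {
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(url.trim());
  const scheme = match?.[1];
  return scheme ? scheme.toLowerCase() : null;
}

// Lists the included resources that are fetched over plain HTTP from an HTTPS page.
function findMixedContent(pageUrl: string, resourceUrls: readonly string[]): string[] {
  if (schemeOf(pageUrl) !== "https") return [];
  return resourceUrls.filter((url) => schemeOf(url) === "http");
}
```

Anything this returns is a candidate for either switching to an HTTPS URL or localizing on your own server.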
Script-based integration is really simple, but it actually violates the security boundaries of the web page, because the script is included in your security context and that makes you vulnerable to the included script. Iframes can leverage the same-origin policy: if they have a different origin, they’re isolated, so they do preserve the security boundaries, but they hinder interaction. So there are cases where iframes are not that easy to deploy.
For instance, in our example application, we had a Twitter gadget, which kind of stands alone, so you can put it in an iframe. Sounds good. But the analytics code really needs access to the page. It needs to be able to attach handlers to different elements to inspect what the user is doing, so putting that in an iframe will give you really boring analytics results.
So it’s a trade-off, and there are no really good solutions available yet. A script has full access to the client-side context, which is not what you want, but you have little choice. There are a few existing techniques to constrain scripts, but as a disclaimer, I kind of have to say that none of them are really practical.
Of course, if your advertisement provider doesn’t really like that, then what can you do about it? There’s browser-based sandboxing. These are academic techniques, but they work really well. They allow you to enforce a security policy on the script, so it’s basically an isolated module, but it requires modifications to the browser, so that’s kind of a deployment nightmare. There’s server-side rewriting, and it requires control of the scripts, because it rewrites the script.
So that’s definitely the way it’s going, and where the solution for this problem will come from. Iframe-based integration is the other approach. As I said before, iframes are controlled by the same-origin policy, so they allow you to have some isolation and contain things nicely, and they are well suited for separate components, for instance advertisements.
They typically have this square space somewhere, so integrating them with iframes is a really straightforward solution. If they don’t need access to the page, you can put them in an iframe and be done with it. Additionally, HTML5 introduces more security controls for the iframe, which will probably be covered in Lieven’s talk on Thursday: the sandbox attribute. What it basically does is allow you to define additional restrictions on the iframe. For instance, if you enable the sandbox, by default the iframe is not allowed to run any scripts. It’s not allowed to run any plugins.
It’s not allowed to submit forms and there’s a whole bunch of other restrictions that are available and you can relax the restrictions one by one with keywords, so you can say okay, I want to prevent all of this, but I want to allow scripts to be run. So you can get a level of granularity for the protection that you need. Additionally, they support a unique origin. So basically you can tell the sandboxed iframe to reset its origin, to something unique that will never occur anywhere in the browser.
So basically, this completely isolates the frame from any other document: there’s no interaction with other documents, there’s no origin relaxation or whatever. It’s stuck with what it is. So this is well suited for the integration of untrusted content. For instance, in our forum we might want to deploy this.
Every message is user-provided, so it’s inherently untrusted. We can, for instance, use a sandboxed iframe and display the message from the user in that iframe. You don’t see this in the layout, you can make it appear very nice, but you can disable scripts. So if the user manages to inject scripts past your XSS filter, they are still disabled by default, so nothing can happen.
If you’re really sure that you don’t have any other input anywhere on the page, maybe you could rely on that alone, but let’s approach it as defense in depth. That seems the best approach, because this is definitely one of the problems that are not really solved. So what are the best practices for integrating third-party code? If possible, put it in an iframe, so you have a strong separation boundary and you’re good. If the use case allows it, you just put it there and sandbox it if necessary. That’s one approach.
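As an illustration of the forum case, the markup for a sandboxed frame could be generated like this. Using the `srcdoc` attribute to carry the message is an assumption of this sketch (the talk only says the message is displayed in a sandboxed iframe), and the function names are made up. An empty `sandbox` value applies all restrictions; each keyword relaxes one of them.

```typescript
// A subset of the keywords that relax individual sandbox restrictions.
type SandboxKeyword = "allow-scripts" | "allow-forms" | "allow-same-origin" | "allow-popups";

// Escapes text for use inside a double-quoted HTML attribute value.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Builds an iframe that renders an untrusted forum message. With no keywords,
// the frame gets a unique origin and cannot run scripts, plugins, or forms.
function sandboxedFrame(messageHtml: string, relax: readonly SandboxKeyword[] = []): string {
  const sandbox = relax.join(" ");
  return `<iframe sandbox="${sandbox}" srcdoc="${escapeAttribute(messageHtml)}"></iframe>`;
}
```

For untrusted messages the keyword list would normally stay empty; combining `allow-scripts` with `allow-same-origin` for same-origin content is generally discouraged because the framed script could then remove the sandbox.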
If you have to include scripts, do it from trusted providers, and be aware that these are dependencies of your application: you depend on the other party to, first of all, be available, not be compromised, have good scripts, and not abuse you. For instance, if Google decides tomorrow that it wants to take over all the websites, well, basically eighty percent of websites include Google code, so they are kind of able to do that.
Then again, as a counter-argument, if you want to use a mirror for libraries, Google has several libraries mirrored. Another approach, if you have really crucial applications, for instance if you’re building an Internet application where security is really important and you don’t want to include outside code, is to localize the code on your own web server.
The guys that did the large-scale script inclusion study also looked at changes that occurred in scripts, and they concluded that a weekly update schedule should be fine. So if you have it in your development process that once a week you check all the libraries, you check the differences, and perhaps do a quick code review to see what changed, then probably it’s harmless and you can quickly update a library. So for our application, we now have these gadgets that need to be integrated. Well, for the analytics, we do it with a script.
We use Google Analytics, so we kind of hope that they’re secure and they know what they’re doing. There’s no real alternative here, because it needs access to the page: it needs to be integrated with the whole of the page, the whole DOM tree, so there’s little isolation we can do. For the Twitter gadget or any other social media buttons, iframes are really good for this use case.
You can put them in an iframe, so you just frame them and integrate them visually with the page. You don’t really notice that it’s framed, but it allows you to put some boundaries there. Okay, now we’ve covered this. The final topic is the interaction between the client-side code and the backend, and especially the problems with cross-site request forgery and other illegitimate requests. So, interacting with remote services: it’s the web, so it’s fairly straightforward.
These two will be the focus of this section. Recently, in the past few years, additional technology has been introduced. You have WebSockets, which basically allow you to upgrade your HTTP connection to a socket channel, so you can send arbitrary information without having to adhere to the HTTP protocol. You have WebRTC, which is real-time communication for the browser; it’s used for video calls and audio calls.
I think somebody even implemented peer-to-peer communication with it. So there are a lot of new protocols, but they’re being actively developed, so there’s little useful to be said there regarding production use. What are the challenges here? Well, we have difficulty determining where a request originated from.
That’s definitely something we need to know at the server side: if you receive a request, where does it come from? Can we trust it? Is this what we expect? We also need to know whether it was intended by the user, and that’s the problem behind CSRF. You need to know: is this something that resulted from an action by the user, or is this something that was created automatically by an attacker? Those are the two problems that need to be addressed here. So first, HTML-based communication, where you can trigger different kinds of requests.
You have GET requests, which are the simple requests without a body, coming from images, scripts, stylesheet inclusion, things like that, and you have POST requests, which are typically coming from forms and allow you some control over the body content, so you can define some parameters.
You can define the values of the parameters and play a bit with that to get a certain format if you want to. These requests are not really affected by the same-origin policy. The reasoning is that you don’t get access to the raw response: the browser processes it for you, so what’s the worst you can do, right? These requests even carry cookies, because otherwise the server doesn’t know the state and doesn’t know how to associate the request with the user.
So problems arise from this, like cross-site request forgery. Why? Because the cookies are implicit authentication. The server implicitly assumes: if the cookie is there, the request is part of the session, so I’ll execute it as part of the session, regardless of whether an attacker made it go out or the user made it go out. Of course, there are legitimate use cases as well. If you want to include parts of a Facebook profile in your web page, then you will typically send a GET or a POST request which gives you the data to integrate.
So it’s a difficult trade-off. A bit more on CSRF, with a quick schematic of how it works. You have a browser. You have a server, example.com, which is a good one, and you have gallery.com, which is also not intentionally malicious, but is compromised by an attacker. So you have an authenticated session with example.com. This happens all the time, and it doesn’t require you to have the site open anymore, as long as there is an active cookie in the cookie jar of the browser. The session can be in another tab or another window; it doesn’t really matter.
The browser goes to a compromised image gallery. The user wants to see some images, probably of funny cats on the internet, so he receives this page with images and he looks at it. But in the background, there’s a CSRF attack embedded, and it can be completely invisible in an iframe, so he doesn’t even notice it. What happens in the background is that it sends a request to example.com, specifically targeted at this site. It creates a form which has as its target change-email, which is some script at the server side, and it has some values which include the email address of the attacker, because they want to change the registered email address to the attacker’s. The server here is a simple web server.
This simple application doesn’t really know about CSRF, so it just gets a request with a cookie from the user and it executes it. In the background, this is standard procedure on the web. The user, however, hasn’t seen anything happening, so he loads the page and continues browsing on the website, and it’s only a few days, weeks, or months later that he notices that his email address got changed and the account got hacked. So what can you do about this? Well, as Jim mentioned yesterday, a lot of frameworks are starting to build in protection against this by default, so that’s a good thing. If you don’t have such a framework, you need to build this protection in yourself. Again, there’s an OWASP cheat sheet about CSRF, and basically today there are two valid approaches to protecting against CSRF.
There are token-based approaches and the Origin header. A token-based approach basically assumes that you have a secret token, and whenever that secret token is presented again, you assume it’s a valid request. So how does this work? You have a form. This is now about a legitimate operation on the website, so the user really wants to change his email address. You get a form that has action change-email or whatever, and in the form you embed a secret value.
You just generate this value, you store it in the user session, and you send it in the response to the user, which is rendered. He sees the form; he doesn’t see the secret value. Whenever the user now submits this form, the secret value will be submitted as part of the body, and the server will receive it and verify this token against the stored token in the user session. If they match, the server knows that this request came from the response it served to the user before, and it considers this a legitimate request. It’s essential for CSRF protection that the attacker cannot get hold of this token, and he cannot because of the same-origin policy. Basically, the response that is served to the user’s browser is parsed and rendered in a frame or a document or whatever, and because that is a different origin than the attacker’s website, the attacker cannot read the token from it.
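The token flow described above can be sketched on the server side as follows. The `Session` shape and the function names are assumptions for illustration, and the random source is passed in as `generate`, which is assumed to produce a cryptographically secure random string.

```typescript
// Hypothetical per-user session object kept on the server.
interface Session {
  csrfToken?: string;
}

// Stores a fresh token in the session and returns it so it can be embedded
// in the form as a hidden field. `generate` is assumed to be a secure RNG.
function issueCsrfToken(session: Session, generate: () => string): string {
  const token = generate();
  session.csrfToken = token;
  return token;
}

// Compares two strings without stopping at the first mismatch, so the timing
// does not reveal how many leading characters were correct.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Accepts the submission only if the token from the form body matches the
// token stored in the session when the form was served.
function isLegitimateSubmission(session: Session, submitted: string | undefined): boolean {
  const expected = session.csrfToken;
  if (!expected || !submitted) return false;
  return constantTimeEqual(expected, submitted);
}
```

A forged cross-site form carries the user’s cookie but not the token, so `isLegitimateSubmission` returns false for it.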
A second approach is the Origin header. This is also fairly simple. It is a recent proposal, but it’s implemented in most browsers. Basically, a browser will include an Origin header on a cross-origin request, so you receive this as a server and you can check this header to see where the request came from. It’s basically like the Referer header that used to be there, or is still there, but it’s more privacy-friendly, so it shouldn’t be stripped as much as the Referer. So if you see a request for change-email coming in, you check the Origin header, and it says attacker.com, or http://attacker.com, that basically tells you: OK, this request was generated by an attacker and submitted from a user’s browser, but this is not an origin that I trust, so I’m not going to execute this request.
Only if you see your own origin can you accept it, or if you see the origin of one of your partner websites or trusted websites, you can also accept the request. Yes, of course, you can modify this header, because it’s a client-side generated header.
So anybody can generate his own Origin headers. It’s not intended as a server-side access control. The idea is that the browser adds the header, and within the user’s browser the attacker doesn’t have the control to forge it. So yes, you can forge it outside the browser, but then again you don’t really need CSRF to perform actions. No, the only difference is that it contains less information: the Referer header contains the full path and parameters and whatever, and this one doesn’t, only the origin. It’s the same as the Referer header in the sense that it has the same purpose, but the Referer header is often stripped because it contains personal information. It contains what file you were looking at.
It also contains the parameters, and therefore they introduced this header in the hope that it’s not going to be stripped by anonymizing proxies and things like that. So it’s actually the same control. People used to use the Referer header as a CSRF protection mechanism a lot of years ago, but due to continuous stripping and some forging problems, it was discouraged as a CSRF control. In a sense, though, it’s the same.
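On the server, the Origin check for a state-changing request could be as small as the following sketch. The accepted origins (including the partner origin) and the three-way result are assumptions; what to do when the header is absent is left to the application, for example falling back to a token check.

```typescript
// Hypothetical list of origins whose state-changing requests we execute:
// our own origin plus one trusted partner.
const ACCEPTED_ORIGINS: ReadonlySet<string> = new Set([
  "https://example.com",
  "https://partner.example.net",
]);

type OriginDecision = "accept" | "reject" | "no-header";

// `originHeader` is the raw value of the request's Origin header, if present.
function checkOriginHeader(originHeader: string | undefined): OriginDecision {
  // Not every request carries the header; the caller decides how to handle that.
  if (originHeader === undefined) return "no-header";
  // Exact comparison: scheme, host, and port all have to match.
  return ACCEPTED_ORIGINS.has(originHeader) ? "accept" : "reject";
}
```

As the talk stresses, this only helps against requests forged through a victim’s browser; a client outside the browser can send any Origin value it likes, so this is not access control.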
You can use it synchronously or asynchronously, and one of the important uses is that you can use APIs, for instance JSON APIs. In the background, you request some data from the server, it gives you some JSON data, you parse it and you present it. That’s basically the Ajax design principle, and it allows you to have single-page applications and all that stuff. So it’s a really good invention, and how it works is as follows.
Basically, you have some URL that you want to send a request to. You create a new request with this constructor, you say what method it is and where it has to go to, and you define a function that handles the response when it comes back. Then you send the request and the browser takes care of the rest.
The browser makes the request, gets a response back, and hands you the response for processing in the onload function. There you can inspect it. If it’s HTML, you have to inject it yourself through innerHTML or something. If it’s JSON, you can use the JSON API to parse it. If it’s XML, you can do whatever with it. Basically, you’re free to do what you want. How does this work with the same-origin policy?
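The request flow just described (open, define a handler, send, parse JSON in the handler) is sketched below. To keep the sketch runnable outside a browser, the request object is passed in and typed with a small local interface, `XhrLike`, which is an assumption of this example and only mirrors the handful of `XMLHttpRequest` members used here.

```typescript
// The subset of XMLHttpRequest used in this sketch, declared locally so the
// flow can be exercised with a stand-in object outside a browser.
interface XhrLike {
  open(method: string, url: string): void;
  send(): void;
  onload: (() => void) | null;
  status: number;
  responseText: string;
}

// Requests JSON from `url` and hands the parsed data to `onData`.
function requestJson(
  xhr: XhrLike,
  url: string,
  onData: (data: unknown) => void,
  onError: (reason: string) => void,
): void {
  xhr.open("GET", url); // say what method it is and where it has to go
  xhr.onload = () => {
    // The browser calls this once the response has arrived.
    if (xhr.status !== 200) {
      onError(`unexpected status ${xhr.status}`);
      return;
    }
    try {
      onData(JSON.parse(xhr.responseText));
    } catch {
      onError("response was not valid JSON");
    }
  };
  xhr.send(); // the browser takes care of the rest
}
```

In a page, the same three steps are performed on an object created with `new XMLHttpRequest()`; parsed data should still be treated as untrusted before it is put into the document.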
They want to be able to use a Google API to request some JSON data. They want to be able to use other cross-origin APIs or anything, and that used to be possible only with some dirty hacks that were found by accident and not intended to be used. The problem, however, is that legacy server code does not expect such requests. For instance, XHR going cross-origin allows you to make a cross-origin PUT request. But if you have an API that accepts PUT or DELETE requests, it implicitly assumes that they can only come from your own origin, and if you suddenly start accepting them from everywhere, then you’re going to be in big trouble fairly soon.
One example is Facebook; they actually had this problem. They had some code on the mobile version, if I’m not mistaken, that accepted a URL and loaded it using XHR. They did it assuming that it’s same-origin, so what’s the worst that can happen? An attacker can load a different Facebook page. Not really a problem, one might think. But if you have cross-origin requests, then all of a sudden you can include any page you want and it’s rendered within the facebook.com origin, which allows you to do a lot of nasty stuff.
So this is exactly one of the examples that you need to take care of, and what the developers of this API effectively took into account. It’s the Cross-Origin Resource Sharing API, or CORS for short, and what it basically does is propose a mechanism that uses headers to enforce some controls, so that you don’t provide additional capabilities to an attacker. The attacker already has the traditional HTML capabilities, so he can send GET and POST requests, but with the introduction of this API he should not be able to do more.
So that’s really important. They assume that existing websites already take this capability into account, which may not be the case, but that’s what they assume. So let’s not give attackers more with this API, and that explains a lot of the design decisions in this API.
So how does this work? You make an XHR request from the client side. This is basically, oh, it’s the next slide, and this is basically the same as before, except the browser will do some additional things in the background. The goal of this API is that when a server processes your request, it first decides whether it wants to or not, and then it tells the browser how to proceed with the response. So it gives a response, and if the request comes from an untrusted origin, it tells the browser: okay, I really don’t like this, so please don’t give them access to my response.
In this case, if an attacker succeeds in making a request that was not allowed, he will not be able to access the response. He will not be able to extract CSRF tokens or whatever. So they really want to prevent additional attack vectors, and they either use the policy I described before, or they use a preflight request. I’m not going to go into detail about preflights, but basically, they allow normal requests that look like HTML requests to happen and prevent access to the response. But for really complicated ones, like a PUT or a DELETE request, you really don’t want to execute it first and then tell the browser: oh yeah, I don’t want the attacker to have access to the response, because you already deleted some file. So they first send a preflight, which checks with the server: is it okay if I send this DELETE request coming from here? If the server says yes, you can do that, then the browser executes it.
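The distinction between requests that "look like HTML requests" and the ones that get a preflight can be approximated as below. This is a simplified sketch of the rule, not the full specification text: it only considers the method, the header names, and the Content-Type value, and the function name is made up.

```typescript
// Methods an HTML page could already trigger without any script API.
const SIMPLE_METHODS: ReadonlySet<string> = new Set(["GET", "HEAD", "POST"]);

// Content types an HTML form could already send.
const SIMPLE_CONTENT_TYPES: ReadonlySet<string> = new Set([
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
]);

// Request headers a script may set without making the request "non-simple".
const SAFELISTED_HEADERS: ReadonlySet<string> = new Set([
  "accept",
  "accept-language",
  "content-language",
  "content-type",
]);

// Returns true when the browser would first ask the server for permission.
function needsPreflight(method: string, headers: Record<string, string>): boolean {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  for (const name of Object.keys(headers)) {
    const lower = name.toLowerCase();
    if (!SAFELISTED_HEADERS.has(lower)) return true;
    if (lower === "content-type") {
      // Ignore parameters such as "; charset=UTF-8" when comparing.
      const mediaType = ((headers[name] ?? "").split(";")[0] ?? "").trim().toLowerCase();
      if (!SIMPLE_CONTENT_TYPES.has(mediaType)) return true;
    }
  }
  return false;
}
```

So a plain GET or a form-style POST goes out directly and only the response is protected, while a DELETE, a JSON POST, or anything with a custom header is checked with the server first.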
The developer doesn’t notice this; it happens in the background and it’s completely invisible. This API was created for use with XHR, but it’s already used in other APIs as well. When a resource is loaded cross-origin, regardless of whether it’s through XHR or not, you can add these headers and the browser will take them into account for certain technologies. For instance, you have the HTML5 canvas element, which becomes security-sensitive when it contains an image loaded from another origin, and with the CORS headers you can relax the restrictions imposed there, but I’m not going to go into detail on that.
So how does CORS work? Sending the request is basically exactly the same as before, with one modification: you have to explicitly ask to use credentials if you want that. So that’s basically one attribute added. (Okay, that’s a typo on the slide.) If you don’t want that, you don’t have to add it, so it’s fine. What happens in the response? This is a simple request. Let’s say you request some API information. The server fetches that information, sends it back, and then adds some CORS response headers. So it says Allow-Origin, and if this string here doesn’t match the origin that effectively asked for the data, the browser will automatically deny access to the response, so you will never see the data from the API.
If credentials need to be used for that API, the server will tell the browser whether credentials are allowed to be used or not, and if this doesn’t match with the use of credentials, access is again denied. And if the application uses custom headers, you can allow the script that receives the response to access these headers. By default, it can only access some default headers that are not security-sensitive, but since the browser doesn’t know whether your headers are security-sensitive or not, you have to explicitly approve access to these headers. So, a practical example of sharing an API with CORS: how would you do it? Well, first, a checklist. If you receive a request on the server side, you have to check the origin of the request. That’s really the Origin header I talked about with CSRF; it comes into play again, so it’s reused.
If this origin matches something you want to allow, then okay, you can proceed. You can check the method that’s used: if this is a GET request and the API allows GET requests, okay, fine. This is also the moment you want to perform some access control if necessary. If you want to prevent certain people from accessing it and there are credentials, you can do it here. You can execute the request, and then when creating the response, you add the appropriate response header. So you either add the header saying: okay, this origin is allowed to access it, or you add a header saying this origin is not allowed to access it.
So this is an API we will allow CORS access to, and it is public: we don’t require credentials, and everybody can access this information. Let’s call it the public information registry or whatever. So we can even allow a wildcard origin. You can tell the browser: okay, any origin is fine, I don’t care, let them access this.
We allow the use of credentials, and we expose the X-Version header to the client-side script, so it can use xhr.getResponseHeader with X-Version to inspect this header. I’m not aware of the current state of framework support; I haven’t checked this recently, but I think that a lot of frameworks already support CORS through some XML configuration. So you can add this in an XML file and you don’t have to add the headers yourself. OK, so in our application, what does this mean, or how does this work?
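The server-side checklist (check the origin, check the method, then add the response headers) could be sketched as follows. The policy shape and function name are assumptions for illustration. One detail this sketch handles deliberately: browsers do not honor the wildcard origin on credentialed requests, so `Access-Control-Allow-Credentials` is only ever sent together with a specific, allowlisted origin.

```typescript
// Hypothetical policy description for one API endpoint.
type CorsPolicy =
  | { kind: "public"; methods: ReadonlySet<string>; expose: readonly string[] }
  | {
      kind: "restricted";
      methods: ReadonlySet<string>;
      expose: readonly string[];
      origins: ReadonlySet<string>;
      allowCredentials: boolean;
    };

// Returns the CORS headers to add to the response. An empty result means
// "add nothing", in which case the browser withholds the response from the script.
function corsResponseHeaders(origin: string, method: string, policy: CorsPolicy): Record<string, string> {
  const headers: Record<string, string> = {};
  if (!policy.methods.has(method)) return headers;
  if (policy.kind === "public") {
    // Public data without credentials: any origin may read the response.
    headers["Access-Control-Allow-Origin"] = "*";
  } else {
    if (!policy.origins.has(origin)) return headers;
    headers["Access-Control-Allow-Origin"] = origin;
    headers["Vary"] = "Origin"; // the response differs per requesting origin
    if (policy.allowCredentials) headers["Access-Control-Allow-Credentials"] = "true";
  }
  if (policy.expose.length > 0) {
    headers["Access-Control-Expose-Headers"] = policy.expose.join(", ");
  }
  return headers;
}
```

For the public information registry from the talk, the `"public"` variant with `expose: ["X-Version"]` yields the wildcard origin plus the exposed header; an accounts API would use the `"restricted"` variant instead.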
You want interaction from this origin to the accounts part. We used to do it with an iframe, through the client-side component, but now we can directly contact the accounts API on the backend. So using CORS, we are able to do it like this, without having to include the client-side component and use postMessage to ask for information there.
I’m going to quickly wrap up what we’ve seen. First of all, if you remember something from this session, what should it be? The origin is a core concept in web security, so design your application around origins. Take that into account, and be aware that the origin not only represents the URL, with HTTP or HTTPS, but that it’s also used within the browser for security decisions. If you have data storage within the browser, it’s limited per origin. You can create file systems within origins within the browser. If an API requests permissions, for instance to share your location, it’s typically done based on an origin. So this origin is really valuable and used for a lot of security decisions. Be very aware that this is an important concept. If you compartmentalize your application, you basically introduce natural barriers, so you prevent escalation if one part is compromised.
You also give yourself the flexibility to define cookies for a specific domain, or to use storage only in a specific origin. So it’s good practice to think about this when building a web application. Then, it’s really old advice, but still really applicable: treat incoming messages as potentially untrustworthy. This holds everywhere. It holds for things you get in a request at the server side.
It also holds for things you get from postMessage at the client side, and for data you get from a local data store, because who knows which script might have modified your data. If you trust it blindly, you’re vulnerable to even persistent XSS injections. At the server side, SQL injection is also a problem covered by this statement. And then finally, be aware of the external parties you trust. Be aware of the trust relationship with the scripts you include, and think about whether you want to do this or not. Investigate how trustworthy this partner is.
Have they been compromised before, things like that. The first step is being aware that you trust these external parties, and then you can start thinking about it: can I accept this or not, or how can I mitigate the risk? For those of you interested, the deliverable I mentioned before for the STREWS project is online on the project’s website. It’s a general overview of web security and it covers a lot of the topics I talked about today.
So that’s definitely a good read. If you want to know more about the browser security policies, you have the book The Tangled Web by Michal Zalewski. He’s a Google guy. He has the Browser Security Handbook, which is a wiki online, maybe you know of it, and it describes these policies in a lot of detail.
So basically he covers the essentials, starting from HTML, and then how different browsers include different scripts. It can be really technical sometimes, but after the book you will know all the quirks and all the decisions that have been made when creating the web. And then finally, The Death of the Internet is a book that looks forward at how things can go wrong. This is, well, not an easy read, so be aware when you start on it, but it discusses the motives of attackers. It goes into detail: why would someone want to compromise an advertisement server? How can they gain money with it? What’s their incentive? They also discuss potential defenses, of course, but this is a lot more technical than the others. Okay, so that’s basically all I had for today.
Um, I don’t know if you have any questions which I might be able to answer. Yes, so it’s a request header. Basically, if you generate a request from the browser, you cannot spoof it, because the browser prevents any API from specifying an Origin header. But if you just write whatever script, say a Perl script that generates an HTTP request, then you can of course set your own values. So you should never use it for access control, but you can use it to determine whether you want to at least look at the request or not. But, well, it’s used for CSRF protection.
The thing with CSRF is that in a CSRF attack, the cookie of the user is used to send a request, and that’s typically not possible when you have a script that generates your own HTTP requests, because you don’t have the cookie of the user. Well, if you have that, then it’s game over anyway, mostly. [Audience comment, largely inaudible, about cases where the Referer header is empty.]
That’s actually a good comment, because with the Referer header, when going from HTTPS to HTTP, it was omitted by default, because HTTPS is considered security-sensitive. They don’t want to leak any information about it, so they just left out the Referer header, which is not practical if you use it as a defense. So the Origin header should be more robust. If you’re really interested, there is a specification, an Internet-Draft, about it that I guess you can check out.
You can use both; they’re both valid defenses, but the Origin header might have less support, so yeah, I’m not sure about it. A good website for that: yeah, there’s a website, caniuse.com, and it basically lists all the new features coming up in the web, and it also lists the support in all browsers.
So, basically, for every feature you have this overview table with green and red, and you can quickly spot the support. Yes, the X-Frame-Options header. I guess Lieven will cover that on Thursday in more detail, but it cannot be used as a defense against including third-party scripts. It can be used to defend your site from being included by an attacker, so it prevents attacks like clickjacking and things like that, but it does not offer any protection against including a third-party script, because, yeah, the script can set that, but then the malicious script cannot be included in an iframe or something like that.