A Developer's Introduction to Web Technologies
History, evolution and principles
From static ... to server-side … to client-side … to infinity (and beyond)
Overview
● Internet Protocols: TCP/IP, HTTP
● HTTP Headers, Optimisation, Security
● Server-side Dynamic Content
– Java Servlets, JSP
– Apache, Tomcat, Web Architectures
– MVC Frameworks
● Client-side Dynamic Content
– JavaScript
– GWT: Google Web Toolkit
– jQuery, AngularJS
● JSON + JavaScript everywhere: NoSQL, Node.js ...
Key Layers of the Internet
http://en.wikipedia.org/wiki/Internet#mediaviewer/File:Internet_Key_Layers.png
Historical Evolution = Low Level → High Level
[Figure: abstractness (low-level hardware, then protocol & software, then application) plotted against time, with marks at 1975, 1990, 1993 and 1996]
Quiz
● What do these mean?
– Internet
– Ethernet
– Intranet
– WorldWideWeb
– HTTP/HTML HyperText
– http://www
– W3C
Internet ?
http://en.wikipedia.org/wiki/Internet
Internet : Servers ← Networks→ Clients(You)
Internet = TCP/IP = “The” NET (hardware) + TCP/IP (software protocol) IP v4 / v6
OSI Model : 7 levels
http://en.wikipedia.org/wiki/OSI_model
Internet TCP/IP OSI model
010100101011 Packets
Client 7->6...1 → Server 1->2-> ..7
TCP/IP for Network Admins
SYN, SYN-ACK, ACK … Data/ACK … Close (FIN)/ACK
TCP/IP For Developers
● Reliable point-to-point data transfer
● Duplex channel: 2 independent pipes (4 end points)
● You write (resp. read) => the destination reads (resp. writes)
localSocket.write(data) → 011000101... → remoteSocket.read(data)
localSocket.read(data) ← 0100111... ← remoteSocket.write(data)
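The two independent pipes can be sketched in a few lines of Python. This is only an illustration: `socket.socketpair()` stands in for a real TCP connection, and the names `local_socket` and `remote_socket` are chosen here to mirror the slide.

```python
import socket

# socketpair() returns two already-connected sockets; here they stand in
# for the "local" and "remote" ends of one TCP connection.
local_socket, remote_socket = socket.socketpair()

# Pipe 1: local writes, remote reads.
local_socket.sendall(b"ping")
received_by_remote = remote_socket.recv(1024)

# Pipe 2 (independent of pipe 1): remote writes, local reads.
remote_socket.sendall(b"pong")
received_by_local = local_socket.recv(1024)

print(received_by_remote, received_by_local)

local_socket.close()
remote_socket.close()
```

Each end owns one write side and one read side, which is where the "4 end points" come from.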
Client-Server Role Asymmetry
● Usually not 2x2 threads in parallel
● For simplicity: sequential …
– Client: write then read / Server: read then write
● The socket is used to encode “Request – Response“
Client-Server Role Asymmetry
● “Marshalling”: encoding with an end delimiter
● The server reads up to the delimiter, then processes …
● Same for the response
Sequence over time:
(1) client: write request (marshal)
(2) server: read request (unmarshal)
(3) server: write response (marshal)
(4) client: read response (unmarshal)
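The four steps above can be sketched as follows. This is a minimal illustration, not a real protocol: the newline delimiter, the helper names `read_message` / `write_message`, and the "upper-case the request" processing are all assumptions made for the example.

```python
import socket
import threading

DELIMITER = b"\n"  # assumed end-of-message marker for this sketch


def read_message(sock):
    """Unmarshal: read bytes until the delimiter, return the payload."""
    buffer = b""
    while not buffer.endswith(DELIMITER):
        chunk = sock.recv(1)  # one byte at a time: simple, not fast
        if not chunk:
            break  # peer closed the connection
        buffer += chunk
    return buffer.rstrip(DELIMITER)


def write_message(sock, payload):
    """Marshal: append the delimiter so the reader knows where to stop."""
    sock.sendall(payload + DELIMITER)


def serve_one_request(sock):
    request = read_message(sock)          # (2) read request
    write_message(sock, request.upper())  # (3) write response


client_sock, server_sock = socket.socketpair()
server = threading.Thread(target=serve_one_request, args=(server_sock,))
server.start()

write_message(client_sock, b"hello")  # (1) write request
response = read_message(client_sock)  # (4) read response
server.join()
client_sock.close()
server_sock.close()
print(response)
```

A delimiter only works if it can never appear inside the payload, which is one reason HTTP announces the size of binary content in a header instead.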
HTTP Protocol Encoding … see later
● ULTRA SIMPLE encoding of request/response
● Line-oriented protocol with any “key: value” header, a blank line as delimiter, and the binary content size given as a header
More on this later…
Socket for the OS
● File handle
● 4 numbers: source/destination IP + source/destination port
● 2 buffers:
– Write buffer (not sent yet … or not acknowledged yet)
– Read buffer: acknowledged, but not read yet
Socket: 6 System Calls
● 6 system calls:
– ip = getaddrinfo(hostname)
– Client only: socket = connect()
– Server only: serverSocket = listen(), socket = serverSocket.accept()
– Symmetric: socket.write(data), socket.read(data)
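As a rough illustration, the same calls map onto Python's `socket` module as shown below. The loopback address, the OS-chosen port and the "echo:" reply are assumptions for the sketch. Note that a real server also calls `bind()` before `listen()`, a step the list above folds into `listen()`.

```python
import socket
import threading

# Server only: bind + listen(), then accept() one connection.
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server_socket.listen()
port = server_socket.getsockname()[1]


def accept_and_echo():
    connection, _client_address = server_socket.accept()
    with connection:
        data = connection.recv(1024)         # symmetric: read
        connection.sendall(b"echo:" + data)  # symmetric: write


server_thread = threading.Thread(target=accept_and_echo)
server_thread.start()

# Client only: resolve the address, then connect().
family, socktype, proto, _canonname, address = socket.getaddrinfo(
    "127.0.0.1", port, family=socket.AF_INET, type=socket.SOCK_STREAM)[0]
client_socket = socket.socket(family, socktype, proto)
client_socket.connect(address)
client_socket.sendall(b"hi")      # symmetric: write
reply = client_socket.recv(1024)  # symmetric: read

client_socket.close()
server_thread.join()
server_socket.close()
print(reply)
```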
IP Routing
● OSI Layer 3 = routing of IP packets
● DNS: converts hostnames to IP addresses
– Redirection to another DNS server for sub-domain names
● Utility to trace an IP route: “traceroute”!
DNS Lookup example
IP(s) = nslookup(hostname)
Note: Google has several IPs! For load balancing and fault tolerance ... and many more back-end servers!
https://173.194.45.233
Another (Socket) Test : Telnet
CURL : command line tool for HTTP With verbose option … see IP, header, ...
Summary for “HTTP GET /” at TCP/IP Level
ip = getaddrinfo(“google.com”)
port = 443; // http: 80 … https (SSL): 443
socket = connect(ip, port) // TCP connect (SYN, SYN-ACK, ACK)
s = sslSocket(socket) // … exchange SSL certificates
s.write(“GET / HTTP/1.0\r\n“) // write: request line
s.write(“key1: value1\r\n”) // write: header line(s)
s.write(“\r\n”) // write: blank line = end of headers
s.flush()
…
s.read() // read: response
s.close() // close
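A runnable version of the connect / write / read / close sequence is sketched below. To stay self-contained it makes two assumptions that differ from the slide: it talks plain HTTP (no SSL step), and it targets a throwaway local server (`HelloHandler`, a hypothetical name) instead of google.com.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server, so the sketch needs no network."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# connect: the OS performs the TCP handshake here
sock = socket.create_connection(("127.0.0.1", port))
# write: request line, then a blank line ending the (empty) header block
sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
# read: with HTTP/1.0 the server closes the connection after responding,
# so reading until recv() returns b"" collects the whole response
raw_response = b""
while True:
    chunk = sock.recv(4096)
    if not chunk:
        break
    raw_response += chunk
sock.close()  # close
server.shutdown()
server.server_close()

status_line = raw_response.split(b"\r\n", 1)[0]
print(status_line)
```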
Summary of “HTTP GET /” at HTTP level
Request
Line 1: request line (verb + URL with URL params + version): GET / HTTP/1.0
Header: key1: value1, key2: value2
Blank line: separator
Body: 00110101
Response
Line 1: status line: HTTP/1.0 200 OK
Header: key1: value1, key2: value2
Blank line: separator
Body: 00110101
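The layout above (first line, headers, blank line, body) is simple enough to parse by hand. The sketch below is a simplified illustration: the function name `parse_http_response` and the sample response are made up for the example, and it ignores real-world details such as repeated headers or chunked bodies.

```python
def parse_http_response(raw):
    """Split a raw HTTP response into (status code, headers dict, body)."""
    head, _separator, body = raw.partition(b"\r\n\r\n")  # blank line
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, status_code, _reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        key, _colon, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return int(status_code), headers, body


raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-Type: text/plain\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
status, headers, body = parse_http_response(raw)
print(status, headers["content-length"], body)
```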
HTTP Status Codes: 200, 404, ...
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
● 200 = OK
● 3xx = redirection … 301: Moved Permanently, 302: Found (temporary redirect), 304: Not Modified
● 4xx = client error … 401: Unauthorized, 403: Forbidden, 404: Not Found
● 5xx = server error … 500: Internal Server Error
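The first digit of a status code gives its family. As a small illustration, Python's standard `http.HTTPStatus` enum can supply the official reason phrases; the helper `status_class` and its family labels are names invented for this sketch.

```python
from http import HTTPStatus


def status_class(code):
    """Return the family of a status code, based on its first digit."""
    families = {2: "success", 3: "redirection",
                4: "client error", 5: "server error"}
    return families.get(code // 100, "other")


for code in (200, 301, 302, 304, 401, 403, 404, 500):
    print(code, HTTPStatus(code).phrase, "=>", status_class(code))
```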
404 ? not only a car
HTTP Restrictions?
● ANY header key text allowed!
● ANY header value text allowed (encoded as hexa)
● Any binary content allowed
● In both directions (request and response)
Protocols “over HTTP”
● HTTP headers can pass extra (meta-)data (security, session, version info, supported languages...)
● HTTP as a “meta protocol” can be used to build protocols “over HTTP”
– Level 7 => used as a transport layer (Level 4)!
● Advantages: … will pass all web firewalls ...
● Examples: Web Services, REST JSON, Protobuf over HTTP, Burlap, Hessian, ...
Header Keys: Standard, Optional, Unrecognised ...
● Some standardised keys for the Web (web browsers / web servers)
● When a key is unknown => ignore the key!!!
Example Header: Language Negotiation
Accept-Language:, Accept:
Please talk to me in “American English”, otherwise “English”, and in HTML format … otherwise XHTML, ...
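The "otherwise" ordering is expressed in the header with quality values (`q=`). The sketch below shows one simplified way a server could honour them. The function names and the fallback to the first supported language are assumptions, and it does exact tag matching only (no "en-US falls back to en" logic).

```python
def parse_accept_language(header_value):
    """Return language tags sorted by decreasing quality (q) value."""
    weighted = []
    for item in header_value.split(","):
        tag, _semicolon, params = item.strip().partition(";")
        quality = 1.0  # default when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            quality = float(params[2:])
        weighted.append((quality, tag.strip()))
    # sorted() is stable, so equal-quality tags keep their header order
    return [tag for quality, tag in
            sorted(weighted, key=lambda pair: pair[0], reverse=True)]


def choose_language(header_value, supported):
    """Pick the first preferred language the server supports."""
    for tag in parse_accept_language(header_value):
        if tag in supported:
            return tag
    return supported[0]  # assumption: fall back to the server default


preferred = parse_accept_language("en-US,en;q=0.8,fr;q=0.5")
print(preferred)
print(choose_language("en-US,en;q=0.8,fr;q=0.5", ["fr", "en"]))
```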
Browser Language Preference
What's Next ? Internet, HTTP Protocol
HTTP Cookie Session HTTP Security Authentication
Close Socket after each HTTP GET?
See the optional, negotiable HTTP header option: Connection: keep-alive
Hit a different clustered server at each HTTP GET?
So where is my session? Here: as a cookie (or HTML5 local storage)
NOT HERE: NOT in your connection(s)!
Session Cookie:
● “Cookie:” header … uniquely generated for the session
● Stored in the browser, passed with each request
● => emulates a session over multiple connections (an OSI Level 5 session on top of OSI Level 4 connections)
● CAN be used for authentication, BUT not always … (anonymous users can have a session)
Set-Cookie: => Cookie:
Session using Cookie + Server DB
Set-Cookie: => Cookie:
JSESSIONID = new random number …
INSERT INTO SESSION (sessionId, data) VALUES (...)
SELECT data FROM SESSION WHERE sessionId = ?
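The two SQL statements above can be tried out with a small sketch. Everything specific here is an assumption for illustration: an in-memory SQLite database, the helper names `create_session` / `load_session`, a cookie header carrying only `JSESSIONID`, and the sample session data.

```python
import secrets
import sqlite3

db = sqlite3.connect(":memory:")  # assumption: in-memory DB for the sketch
db.execute("CREATE TABLE session (sessionId TEXT PRIMARY KEY, data TEXT)")


def create_session(data):
    """Server side of Set-Cookie: generate an id and persist the session."""
    session_id = secrets.token_hex(16)  # unguessable random id
    db.execute("INSERT INTO session (sessionId, data) VALUES (?, ?)",
               (session_id, data))
    return session_id, "Set-Cookie: JSESSIONID=" + session_id


def load_session(cookie_header):
    """Server side of Cookie: look the session up by the id it carries."""
    # assumption: the header value carries only the JSESSIONID cookie
    _name, _equals, session_id = cookie_header.partition("=")
    row = db.execute("SELECT data FROM session WHERE sessionId = ?",
                     (session_id,)).fetchone()
    return row[0] if row else None


session_id, set_cookie = create_session("cart=3 items")
print(set_cookie)
print(load_session("JSESSIONID=" + session_id))
print(load_session("JSESSIONID=unknown"))
```

Because the session lives in the database and not in the connection, any server of the cluster can load it from the cookie value alone.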
Security ABC
● Encryption: use SSL to encrypt any transfer, http => https
● Authentication: the user must identify with a private secret: user/password, RSA keys, ...
● Authorization: resource permissions controlled per user/group (role)
Security Authentication 2 Modes : Login.html / Basic
Using /login.html page redirection
Using Basic authentication
Security using 302 … redirect to login.html
HTTP GET /any/resource
HTTP redirect ... login.html
HTTP GET /login.html
HTTP POST /login.html … user+password
HTTP redirect /any/resource
Security using POST login user/password … Set-Cookie: SessionId
HTTP POST … login user+password
302 ... redirect to previous URL
“Logout” = Clear Session
Redirected to home page … (then login page again) Cookie still present on client … but unrecognised on server!!!
The browser clears the cookie to “Logout”
Sample Spring Security Login Page
Security with Basic Realm
WWW-Authenticate => Authorization
Native modal dialog opened in the browser
Sample Spring Security for Basic Auth
Basic Authentication 401 : Bad Credentials
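Under Basic authentication the browser answers the `WWW-Authenticate` challenge with an `Authorization` header holding `user:password` encoded in Base64 (encoded, not encrypted). The sketch below illustrates both sides. The function names and the credentials are invented for the example, and a real server would compare against stored hashed credentials rather than plain strings.

```python
import base64


def basic_authorization(user, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def check_basic_authorization(header_value, expected_user, expected_password):
    """Return 200 if the credentials match, else 401 (bad credentials)."""
    scheme, _space, token = header_value.partition(" ")
    if scheme != "Basic":
        return 401
    try:
        decoded = base64.b64decode(token).decode("utf-8")
    except ValueError:  # malformed Base64 or invalid UTF-8
        return 401
    user, _colon, password = decoded.partition(":")
    if (user, password) == (expected_user, expected_password):
        return 200
    return 401


header = basic_authorization("alice", "secret")
print(header)
print(check_basic_authorization(header, "alice", "secret"))
print(check_basic_authorization(header, "alice", "wrong"))
```

Since Base64 is trivially reversible, Basic authentication is only reasonable over HTTPS.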
What's Next ? Internet, HTTP HTTP Cookies & Security
HTTP Performance Tuning HTTP Header negotiation Scalable Web Architectures
HTTP Performance? Reminder … HTTP over TCP
TCP connect (SYN, SYN-ACK, ACK), SSL certificates, write, read, close
Header Optimisation: TCP socket keep-alive
Close vs Keep-Alive
Close: TCP connect (SYN, SYN-ACK, ACK), SSL certificates, write, read, close
Keep-Alive: write, read (on the connection that is already open)
Optimisation: GZIP Negotiation
Accept-Encoding: => Content-Encoding:
You can talk to me in GZIP ... I am able to decompress. Let's save some bandwidth.
[Diagram: gzip accepted? ... server: content to GZIP → 01101010100101 → browser: GZIP to content]
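The negotiation can be sketched with Python's standard `gzip` module. This is a simplified illustration: `encode_body` and `decode_body` are hypothetical helper names, and the header parsing ignores quality values such as `gzip;q=0`.

```python
import gzip


def encode_body(body, accept_encoding):
    """Server side: compress only if the client announced gzip support."""
    accepted = [item.split(";")[0].strip()
                for item in accept_encoding.split(",")]
    if "gzip" in accepted:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


def decode_body(payload, headers):
    """Browser side: decompress when Content-Encoding says gzip."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


page = b"<html>" + b"hello " * 200 + b"</html>"
payload, headers = encode_body(page, "gzip, deflate")
print(len(page), "->", len(payload), headers)
```

A client that does not send `Accept-Encoding: gzip` simply receives the uncompressed bytes, so the optimisation stays backward compatible.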
Browser Cache
If-Modified-Since: => Last-Modified:
I already have an old cached version from date yyyy/mm/dd; send it to me only if it is newer (otherwise answer 304 Not Modified)
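A server-side decision for this header pair might look like the sketch below. The function name `conditional_get` and the sample dates are assumptions for the example. The date helpers come from the standard `email.utils` module, which handles the HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return (status, headers): 304 if the cached copy is still fresh."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_at:
            return 304, headers  # Not Modified: no body is sent
    return 200, headers          # full response, body included


resource_date = datetime(2014, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
status, headers = conditional_get(resource_date)
print(status, headers["Last-Modified"])
# Second request: the browser echoes the date it received earlier.
print(conditional_get(resource_date, headers["Last-Modified"])[0])
```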
You Might … But Should Avoid … Sticky Session = Affinity to Server
You Might … And Usually Do (Web Giants Don't)
SPOF: Database … not clustered!
Login then ask cookie: sessionId
Pass cookie http header : sessionId
Persist Session
Load Session by sessionId
Individual servers crash ... quite often
Any server can be switched ON / OFF at ANY TIME … without LOSS
“Data” must always be replicated
Even a database is a SPOF (Single Point of Failure)
PC/disk Mean Time Between Failures ~ 2 years => in a data center: 1 crash every 10 minutes
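The step from "2 years" to "every 10 minutes" is a matter of fleet size. The back-of-the-envelope sketch below assumes machines fail independently at a constant rate; the function names are invented for the example, and the data center size is derived here, not stated in the slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def minutes_between_failures(machine_count, mtbf_years=2.0):
    """Average gap between failures somewhere in the fleet, in minutes."""
    return mtbf_years * MINUTES_PER_YEAR / machine_count


def machines_for_failure_interval(interval_minutes, mtbf_years=2.0):
    """Fleet size at which one failure happens every interval_minutes."""
    return mtbf_years * MINUTES_PER_YEAR / interval_minutes


print(machines_for_failure_interval(10))  # fleet size for 1 crash / 10 min
print(minutes_between_failures(100_000))  # gap for a 100,000-machine fleet
```

Under these assumptions, one crash every 10 minutes corresponds to a fleet of roughly 100,000 machines.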
Design for High Availability …
Wait … Why Did You Need a Session?
● Put client data on the client!!
● Expose Ajax + RESTful JSON services
● Use a single-page web application: AngularJS …
What's Next ? Internet = TCP/IP + HTTP protocol HTTP Security & Session
Hyper Text HTML Web = ? WorldWideWeb = ? JavaScript, CSS
Google “www” Facebook is rank #1!
The First “Hyper Text“: HTTP + HTML = “WorldWideWeb”
● HTTP: 1989 (Tim Berners-Lee) at CERN
● HTML: 1990 (Tim Berners-Lee), “WorldWideWeb” server & browser
● W3C Consortium: 1994 (Tim Berners-Lee)
http://www.w3.org/People/Berners-Lee/WorldWideWeb.html
WWW's Dad
Internet WWW Facts
WWW Google Facts
WWW Facts
Today's Servers : DataCenter
Today's Clients : Browsers in Mobile & PC
Browsers
Blink = fork of WebKit
Chrome → Blink
Android → Blink
http://en.wikipedia.org/wiki/Web_browser
FireFox WebConsole
Chrome Developer Tools
WebConsole : Debug Network “Http GET /”
Idem Chrome … Net (+ DOM events)
More detail ... TCP analysis (without SSL)
● http://arnaud-nauwynck.github.io
Eth0 Sniffer: Wireshark
Select a packet:
Level 2: Ethernet, router MAC address
Level 3: IP, source/destination IP address
Level 4: TCP, source/destination TCP port
Level 5..7: HTTP, “GET / HTTP/1.1”
Notes on Sniffer
● MUST be ROOT to put the card in “promiscuous mode”
● Too low-level: captures all IP traffic (log pollution)
● Not efficient for application-level HTTP debugging
● Better tools:
– “Man-in-the-middle”, proxy, browser console
Proxy … simply passes requests/responses through
Proxy
Server
Proxy, Man-in-the-middle sample tools
● At TCP level:
– apache-tcpmon
● At TCP level with SSL:
– Sshproxy
● At HTTP level:
● At HTTP level with SSL support:
– “WebScarab”, “mitmproxy”, ...
Configuring Browser Proxy
check Proxy used … but not started yet !
Starting Server HTTP Proxy WebScarab
View HTTP Traffic
HTTP Request-Response Details
Testing HTTPS with SSL untrusted Certificate replaced...
Trust Certificate ? … Add It
Certificate Details
Certificate Trust Chain
Trust Root CA => Trust … => Trust
Fake certificate … trusted by nobody, from my local PC “arn2”
Real certificate: trusted by / by / by ...
Man In The Middle
● A classical problem in network security …
● Alice sends an encrypted message to Bob, but … Mallory intercepts, decrypts and re-encrypts it