Drupal.org

Drupal.org infrastructure plan diagrams: old, new, future

Project: Drupal.org infrastructure
Component: Other
Category: task
Priority: normal
Assigned: Amazon
Status: closed (fixed)

Issue Summary

Hi, as part of the hardware improvements authorized by the Drupal Association board, I am putting together some diagrams to explain our infrastructure plans.

The diagrams are attached as PDF. The source files are OmniGraffle, so please leave comments here or mark up the PDFs and I will make adjustments.

Attachment Size
Drupal-Infrastructure-plan-diagrams.pdf 386.76 KB

Comments

#1

In this draft I added a diagram for logical infrastructure, hardware specifications, and services running on each server.

Attachment Size
Drupal-Infrastructure-plan-diagrams_0.pdf 386.76 KB

#2

Let's try again.

Attachment Size
Drupal Architecture.pdf 639.6 KB

#3

More improvements. Added recommendations.

Attachment Size
Drupal Architecture_0.pdf 644.42 KB

#4

Kieran

I reviewed the latest diagram. This is much needed.

Your recommendation on increasing the number of admins is spot on, and I volunteer for it (among many others).

I am wondering how the MySQL slaves would help now, as we don't have a way to channel writes to the master and reads to the slaves. There is a patch in the queue, but it has not been committed yet. We are moving to auto-increment columns, which are known to have issues with master/slave replication, so this patch is necessary. David Strauss can fill us in with a summary.
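
A minimal sketch of what such a read/write split could look like, written in Python rather than Drupal's PHP. It is not the patch in the queue. The class name, the host names, and the rule used to classify a statement are all assumptions made for illustration.

```python
import itertools


class ReplicatedRouter:
    """Pick a database host for a statement (hypothetical helper).

    Writes go to the master; plain SELECTs rotate across the slaves.
    """

    def __init__(self, master, slaves):
        self.master = master
        # With no slaves configured, reads fall back to the master.
        self._slaves = itertools.cycle(slaves or [master])

    def host_for(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._slaves)
        # INSERT, UPDATE, DELETE, DDL and anything unrecognized stay on
        # the master, which is the safe default.
        return self.master


# Host names below are placeholders, not real Drupal.org machines.
router = ReplicatedRouter("db-master", ["db-slave1", "db-slave2"])
print(router.host_for("SELECT nid FROM node WHERE status = 1"))
print(router.host_for("INSERT INTO watchdog (message) VALUES ('x')"))
```

A real implementation would also have to cope with replication lag, for example by sending a session's reads to the master for a short while after that session writes, and would need to treat locking reads such as SELECT ... FOR UPDATE as writes.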

I am also wondering if converting the database to InnoDB made matters worse. Although row locking happens rather than table locking, InnoDB still locks the entire table on an autoincrement insert (which we will have in the future, see above).

The other thing is a suggestion: since the slowness comes from a few queries, how about a drastic temporary solution: remove the functionality temporarily. For example, the tracker we have now is semi-functional after a patch hit Drupal 5 for it. We miss a lot of the old functionality but can live without it.

So how about a review to identify heavy queries, and then cutting down functionality for some of them?

#5

This should be a separate issue, but here it goes.

MySQL tuning:

We need to add this line to my.cnf and then restart MySQL: log-slow-queries = /var/log/mysql/mysql-slow.log

Tune queries first: MySQL slow log parser

Then tune your engine: Show InnoDB status walk through

Then tune your settings: MySQL-Server-Settings-Tuning

In particular, we should run "show variables" and look at:

  1. innodb_thread_concurrency can cause thrashing
  2. innodb_log_file_size dramatically affects performance
  3. innodb_commit_concurrency - number of threads allowed at commit stage at the same time
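
Once the slow query log from the my.cnf line above is being written, a small script can rank queries by total time. The following is only a sketch, not one of the tools linked above: it assumes the classic log layout, in which a "# Query_time: ..." comment line precedes each statement, and the sample entries are made up for illustration.

```python
import re
from collections import defaultdict

QUERY_TIME = re.compile(r"^# Query_time: ([\d.]+)")


def normalize(sql):
    """Collapse literals so queries differing only in values group together."""
    sql = re.sub(r"'[^']*'", "'S'", sql)   # quoted strings
    sql = re.sub(r"\b\d+\b", "N", sql)     # bare numbers
    return re.sub(r"\s+", " ", sql).strip()


def summarize(lines):
    """Return (total_seconds, count, query_shape) tuples, slowest first."""
    totals = defaultdict(lambda: [0.0, 0])
    query_time = None   # Query_time of the entry currently being read
    statement = []
    for line in lines:
        line = line.strip()
        match = QUERY_TIME.match(line)
        if match:
            query_time = float(match.group(1))
            statement = []
        elif not line or line.startswith("#") or query_time is None:
            continue
        elif line.lower().startswith(("use ", "set timestamp")):
            continue    # bookkeeping statements, not the slow query itself
        else:
            statement.append(line)
            if line.endswith(";"):
                entry = totals[normalize(" ".join(statement))]
                entry[0] += query_time
                entry[1] += 1
                query_time, statement = None, []
    return sorted(((t, n, q) for q, (t, n) in totals.items()), reverse=True)


# Invented sample entries, only to show the expected input shape.
sample = """\
# Time: 070101 12:00:00
# User@Host: drupal[drupal] @ localhost []
# Query_time: 12  Lock_time: 0  Rows_sent: 10  Rows_examined: 500000
SELECT * FROM node WHERE uid = 42;
# Query_time: 8  Lock_time: 0  Rows_sent: 10  Rows_examined: 400000
SELECT * FROM node WHERE uid = 7;
# Query_time: 3  Lock_time: 1  Rows_sent: 1  Rows_examined: 100
UPDATE users SET access = 1167652800 WHERE uid = 7;
""".splitlines()

for total, count, query in summarize(sample):
    print(f"{total:6.1f}s  {count:3d}x  {query}")
```

Grouping by query shape rather than by exact text matters here because the same heavy query usually shows up many times with different ids.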

#6

> The other thing is a suggestion: since the slowness is from a few queries, how about a drastic temporary solution: remove the functionality temporarily. For example the tracker we have now is semi functional after a patch hit Drupal 5 for it. We miss a lot of the old functionality but can live without it.

Remove tracker!? Please don't do this. It's already a pain having to go back through pages of my tracker to find things that have been responded to. Without tracker, it would be pretty well impossible for me to, well, track them.

Michelle

#7

We've never had FTP running on any of our machines. :p

#8

Here are the latest diagrams. I have removed FTP from the old infrastructure.

I'd also like to get more recommendations to go with the diagrams.

Attachment Size
Drupal Architecture_1.pdf 646.96 KB

#9

There are some errors in the graphs:
* I don't know the infrastructure of d.o, but I don't think there is a dedicated firewall, just some iptables on the web boxes themselves (note: I like that design!)

* The firewall isn't load balancing, but DNS is. (I dislike this: it is stupid, unscalable and undetectable. Before I contacted OSUOSL the TTL was even set to a day or two; it is lower now. Note that even browsers cache DNS info, so expect downtime for half the world for at least an hour if one box breaks.)

* Tweaking NFS is an art. Is it UDP or TCP based? Even if you go from a fileserver to a NAS (iSCSI?), tweaking NFS is necessary.

* p. 6: note that there are dozens of ways to "even distribute the load" on webservers. It is rather hard to tweak, but the default (often: if alive on HTTP HEAD, choose the one with the fastest response) might be good enough.

* p. 6: I think it is better to put the Squids on the load balancer. Typically 30-70% will be cached, and you want this as high in the network as possible.

* p. 7: there should be lines from both load balancers to the webservers, and cross connects as well (heartbeat).

* We need to specify which box can use which protocol to where (ACLs). For example, the PHP servers should be able to resolve names and do HTTP GETs, so they need a normal proxy.

* p. 9: here the reverse proxy is on the LBs?

* I don't see what all the pages are for.

* Note: missing, and important IMHO, is an OOB management network! We really need this.

* Old one: a third DNS server outside the OSUOSL AS and domain space.

* Good work, but it needs love :-) If you need help, please contact me.

#10

> There are some errors in the graphs:
> * I don't know the infrastructure of d.o, but I don't
> think there is a dedicated firewall, just
> some iptables on the web boxes themselves (note:
> I like that design!)

Correct, just iptables. Just throw a cloud at the top with two lines coming out of it, that should represent the VIP hitting one of the load balancers.

> * The firewall isn't load balancing, but DNS is (I
> dislike this: stupid, unscalable, undetectable.
> Before I contacted OSUOSL the TTL was even set to a
> day or two; lower now. Note that even browsers
> cache DNS info, so expect downtime for half the
> world for at least an hour if one box breaks)

Neither the `firewall' *nor DNS* is doing load-balancing now. The load-balancer is what distributes load against the web-nodes. Perhaps you are referring to the old setup? As a matter of fact, the only email I see from you on the subject was informing us our TTL of 5 minutes was too low back in October '05. I can also see from our DNS management app that the 10 minutes that it was set to prior to the LVS/IPVS setup was the TTL for at least a year back. Yes, I agree that balancing via DNS is poor, but that's why we switched away from it, and I don't believe we ever had it set for a day. If you are referring to the method of how a load balancer is chosen, there is in fact only one VIP which is obtained and held by either of the boxes in a heartbeatd-like fashion.

> * Tweaking NFS is an art. Is it UDP or TCP based?
> Even if you go from fileserver to NAS (iSCSI?),
> tweaking NFS is necessary.

No, we don't just install things and hope they work without tweaking them here. We're running `rsize=32768,wsize=32768,tcp,nfsvers=3,hard,intr' for our options at the moment.

> * p.6: note that there are dozens of ways to "even
> distribute the load" on webservers, it is rather
> hard to tweak but default (often: if alive on
> HTTP HEAD choose one with fastest response)
> might be good enough.

For the moment, we are using IPVS with the Weighted Least-Connection Scheduling algorithm, which is doing an excellent job of distributing load with the proper weights on each box.
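
For readers unfamiliar with the algorithm, here is a simplified sketch of weighted least-connection selection. It is not the kernel's implementation, which also takes inactive connections into account; the server names and weights are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    name: str
    weight: int
    active: int = 0  # connections currently open to this server


def pick_wlc(servers):
    """Return the server with the lowest active/weight ratio.

    Servers with weight 0 are skipped, i.e. treated as drained.
    """
    candidates = [s for s in servers if s.weight > 0]
    if not candidates:
        raise ValueError("no server with a positive weight")
    return min(candidates, key=lambda s: s.active / s.weight)


# Hypothetical pool: web2 is given three times the weight of web1.
pool = [RealServer("web1", weight=1), RealServer("web2", weight=3)]
for _ in range(8):
    pick_wlc(pool).active += 1
print({s.name: s.active for s in pool})
```

With these weights, eight new connections end up split 2 to 6, which is the 1:3 ratio the weights ask for.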

> * p.6: I think it is better to put the squids on
> the loadbalancer. Typically 30-70 % will be
> cached and you want this as high in the network
> as possible

We can look into this. At the moment we are using IPVS direct-routing, which functions by rewriting MAC addresses on packets. It requires a bit of ARP trickery in the networking stack, but is extremely fast and introduces practically no delay on requests. At the moment, having multiple Squid processes does mean having duplicate caches containing similar information on each web node, even though running Squid does little to hurt available memory or the load. Moving Squid to the load balancer means changing from a network-layer load balancing system to an application-level load balancer. It may function better, it may not, but regardless there are more pressing things to be done on this infrastructure before I will have time to look at this.

Also for page 6: the firewall doesn't exist, so it can't have mysql4.0. The NFS is being served from Drupal1, not the database server, as the RAM needed for NFS was a worse blow to the taxed DB than to the web node. Also, not just the files dir, but the whole webroot is on NFS. This also means you can remove the rsync line. We may keep mailman on one of the drupal boxes, but for the sake of completeness in the diagram I would put `Lists' under the OSU OSL Mail heading.

> * p 7: there should be lines from both
> loadbalancers to the webservers and cross
> connects as well (heartbeat)

True. Actually, I'm not sure what page 7 is supposed to be; it's not where we're headed. Web 3 should be behind a load balancer, for instance.

> * we need to specify what box can get what
> protocol to where (ACL's). For example, the PHP
> servers should be able to resolve and do HTTP
> get's so they need a normal proxy

I don't follow.

> * p. 9: here the reverse proxy is on the lb's?

Yeah, you really can't have both IPVS and Squid on the LBs.

> * I don't see what all the pages are for?
>
> * note: missing and important IMHO is an OOB
> management network! we really need this

There are actually four networks involved: the front-facing, public IP space; the backend network for services (NFS, DB, memcache); the OOB network for HMC access, which remains only accessible to OSL systems staff; and the cross-over cable between the DB boxes that they'll be replicating on.

> * Old one: third DNS outside the OSUOSL AS and
> domain space.

We've got one, looks like it just wasn't inputted to the parent nameservers/registrar. It's about a thousand miles away, different subnet and peering. It's still a .osuosl.org address, I wasn't aware that `domainspace' really mattered for this. I've mentioned this in another thread so we can have it added.

> * good work. but needs love :-) if you need help,
> please contact me

/me heads back to actually implementing this stuff instead of just talking about it :-)

-Eric

#11

Note: you mention various software tricks that can make future hardware purchases unnecessary. They may push those purchases out, but eventually there is only so much a given box can do.

#12

When finished, will this be posted on http://infrastructure.drupal.org/?

#13

Actually, when finished it will be posted on drupal.org home page. But yes, I'll keep source files on i.d.o.

#14

I may take a stab at a version in something like SVG too.

#15

Nice proposal. Note that the NFS is still a single point of failure, though. What are the network speeds of both the uplink and internal networks? Will there be any provisions for realtime off-site backups?

#16

sepeck said:
> I may take a stab at a version in something like SVG too.

Yeah, OmniGraffle is nice (I have it too), but for a more open (and RCS-friendly) format, I've always liked graphviz (dot). Unlike point-and-click drawing programs, graphviz generates directed and undirected graphs from an input that describes nodes and connections. You can generate everything from SVG and EPS to clickable image maps.

Here's a simple example, nothing fancy like the backend network yet.

Example:
png output: http://staff.osuosl.org/~emsearcy/drupalinfra.png
dot source: http://staff.osuosl.org/~emsearcy/drupalinfra.dot

(Note: I used dashed lines to indicate that the second load balancer is for failover, and while it maintains a connection to the webnodes for health checking, traffic isn't going over those lines 99% of the time.)
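
To give a flavour of the input graphviz expects, here is a small Python sketch that writes DOT source for a topology like the one described. It does not reproduce the linked drupalinfra.dot file; the node names (lb1, lb2, web1, db1 and so on) are placeholders chosen for illustration.

```python
def infra_dot(web_nodes, db="db1"):
    """Build DOT source: two load balancers in front of the web nodes,
    which all talk to one database (all node names are placeholders)."""
    lines = ["digraph drupalinfra {", "  rankdir=TB;", "  node [shape=box];"]
    for web in web_nodes:
        lines.append(f'  "lb1" -> "{web}";')
        # Dashed edge: the standby balancer only health-checks the node.
        lines.append(f'  "lb2" -> "{web}" [style=dashed];')
        lines.append(f'  "{web}" -> "{db}";')
    lines.append("}")
    return "\n".join(lines)


print(infra_dot(["web1", "web2"]))
```

Saving the output to a file such as infra.dot and running "dot -Tsvg infra.dot -o infra.svg" should render it. Because the source is plain text, adding a web node is a one-line change that diffs cleanly in version control.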

George, yes, NFS is a single point of failure, even though the box it is on has redundant hardware. However, the performance and reliability of NFS are much better than what you would get from a redundant option like Coda, and an enterprise-level high-availability NFS server cluster is overkill here, I think :-). The OSL's main motivation for the Drupal cluster at the moment is the performance improvements. We're putting in redundancy where we can too.

Both networks are running at 100Mbps/full duplex on enterprise switches (Cisco for the frontend, managed HP switch on the backend).

No offsite backups at the moment, just daily snapshots on a separate system. An offsite system is in the works.

#17

Here are the latest diagrams in PDF. The OmniGraffle source files are going to Tony for review and updating.

To Eric and others: these diagrams aim to tell a story in support of fundraising. The old diagrams relate to past problems. The new diagrams indicate that you are working on it and making great progress. The future diagrams indicate where we need to be for 250% growth in 2007 and 250% growth in 2008. The future diagrams set out a goal, meant to attract the financial support we need to execute it.

It would also help for the Drupal Association to communicate a plan that assures the community that current service levels on Drupal.org are going to improve and that improvements are being worked on. These diagrams are part of explaining that plan.

Attachment Size
Drupal Architecture_2.pdf 655.23 KB

#18

linking: tar of graffle

#19

Is there a reason we're using dual-Xeon servers? The application servers have very processor-intensive, multithreaded workloads. It would be more effective to have one box with a few modern, quad-core processors than a load-balanced array of dual-Xeon boxes.

#20

Mail that quad server to:

OSU OSL
Kerr Admin B210
Corvallis OR 97331

When should we expect it?

;-)

#21

I caught a few references to NFS and other network file systems. We should not be using a network file system for running Drupal.org. It's unreliable and slow. We should do what Wikimedia does:

* Have a test/scratch server to update and evaluate rollouts before going live. Generally, this copy is checked directly out of SVN.
* Have a script to sync the software out to the local disks of live servers. This may consist of an rsync from the test server or running an svn update to the tag tested on the test server. rsync is a bit more flexible for applying patches intended only for the live site. (Having special patches for the live site doesn't apply to Wikimedia because MediaWiki is written primarily for running Wikipedia. If Wikipedia needs something, it goes right into SVN.)
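
The sync step could be sketched roughly as follows. This is an illustration, not Wikimedia's actual script: the paths and host names are hypothetical, and the script only prints the rsync commands unless dry_run is turned off.

```python
import subprocess


def rsync_command(source_dir, host, dest_dir):
    """Build the rsync call that mirrors source_dir onto one live server."""
    return [
        "rsync",
        "-a",          # archive mode: recurse, keep permissions and times
        "--delete",    # remove files that no longer exist in the source
        source_dir.rstrip("/") + "/",   # trailing slash: copy the contents
        f"{host}:{dest_dir}",
    ]


def push_release(source_dir, hosts, dest_dir, dry_run=True):
    """Sync every host in turn; with dry_run only print the commands."""
    for host in hosts:
        command = rsync_command(source_dir, host, dest_dir)
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # stop on the first failure


# Hypothetical staging path, host names and docroot.
push_release("/srv/staging/drupal", ["web1", "web2"], "/var/www/drupal")
```

Stopping on the first failed host is a deliberate choice in this sketch, so that a broken push is noticed before it reaches every server.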

#22

This is not about the Drupal code.

If I am not mistaken, NFS is used for the stuff in the files directory (issue patches, images, ...etc.), so it is shared between the two servers.

NFS has a bad reputation, but IMO it is undeserved in such a case.

#23

Instead of using NFS to appear to host all files on all servers, we should consider establishing a static.drupal.org server that natively hosts all static files. The application servers could still use NFS for manipulating the files on the static server, but the application servers would not serve the static files. Ideally, we could add a configuration option that allows specifying a public URL for files.

All of this may not be worth considering if the static file load is small.
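
The configuration option suggested above could amount to little more than a base URL that is prepended to file paths. A minimal sketch in Python, where the function name is hypothetical and static.drupal.org is the proposed host rather than an existing one:

```python
from urllib.parse import quote


def public_file_url(file_path, base_url="http://static.drupal.org"):
    """Map a path under the files directory to its public URL.

    base_url stands in for the proposed configuration option.
    """
    return base_url.rstrip("/") + "/" + quote(file_path.lstrip("/"))


print(public_file_url("files/issues/patch 1.txt"))
```

The application servers would keep writing files through NFS as before; only the links they emit would change.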

#24

Eric explains the rationale earlier in the thread.

Static files are served from the FTP server.

NFS has done a fine job so far, and it replaced the rsync setup we used before. Yes, the shortcomings of not running NFS as an enterprise-class service are understood, and addressing them is not considered a top priority.

Let's stop kicking about stuff that is working and keep focused on the stuff we can help with. E.g. master-slave patch, slow query logs.

#25

Any more feedback about the diagrams? Eric, any changes before I post a link to the diagrams on Drupal.org? I want to be sure I am reflecting our future plans.

Kieran

#26

Why not move sessions to a separate DB server with failover? There is nothing important in them, but this would take load off the master machine.

#27

Additionally, the load balancers should be active/active. The diagram looks active/passive. F5 BIG-IPs can do this :-).

And I'm missing any plans for SSL cards for SSL logins...

#28

One shot added, with the network (BGP and switching). I know this is boxes-and-arrows stuff, but I hope it helps.

I made up a storage layer, a NAS. Shared, good-quality, low-cost disk space would really help both OSU and Drupal.

Attachment Size
alinfra.png 209.66 KB

#29

One more shot, with "ACLs" to show the flows.

Attachment Size
flows.png 183.35 KB

#30

And for those who are into Windows... :-(

The Visio file is attached, renamed to infra.vsd.png due to upload restrictions. Please save it and run "mv infra.vsd.png infra.vsd".

Attachment Size
infra.vsd_.png 485 KB

#31

Bert: Just a small comment on the "visual" aspect of the schema: not all text is easy to read, and various things are not properly centered. Not important at all but maybe something to massage in future versions.

There are also machines that have no label. Are these just to impress? ;-)

What is the difference between a dotted box and a solid box?

#32

** subscribe **

#33

I think the diagram of the "new" setup is outdated, since I think there was an upgrade to MySQL 5.0?

#34

Here are the latest diagrams. There were a lot of changes over the summer, so I indicated that the new stuff dates from around April 2007. I also added a testing.drupal.org server to the future diagrams.

I have added the slides from the scaling the Drupal.org infrastructure talk.

Attachment Size
druapl-org-scaling-infrastructure-barcelona.pdf 476.08 KB
Drupal.org-Architecture.pdf 1.15 MB

#35

The unlabeled machines have generic function names (I do not have deep insight into the internal structure, but I know the machines are there).

I do not see any dotted lines, and the text seems to be readable and centered. Are we talking about the same graphs?

#36

Status: active » fixed

Closing this issue. It's almost been a year, so it would be good to get them reviewed, but I'll start another issue to do that.

I've asked for help from the new redesign infrastructure team: http://groups.drupal.org/node/15123

#37

Status: fixed » closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.
