| author | Suren A. Chilingaryan <csa@suren.me> | 2018-03-20 15:47:51 +0100 | 
|---|---|---|
| committer | Suren A. Chilingaryan <csa@suren.me> | 2018-03-20 15:47:51 +0100 | 
| commit | e2c7b1305ca8495065dcf40fd2092d7c698dd6ea (patch) | |
| tree | abcaa7006a9c4b7a9add9bd0bf8c24f7f8ce048f /docs/README | |
| parent | 47f350bc3aa85a8bd406d95faf084df2abf74ae9 (diff) | |
Local volumes and StatefulSet to provision Master/Slave MySQL and Galera cluster
Diffstat (limited to 'docs/README')
| -rw-r--r-- | docs/README | 91 | 
1 files changed, 91 insertions, 0 deletions
diff --git a/docs/README b/docs/README
new file mode 100644
index 0000000..4f75b5b
--- /dev/null
+++ b/docs/README
@@ -0,0 +1,91 @@
+OpenShift Platform
+------------------
+The OpenShift web frontend is running at
+     https://kaas.kit.edu:8443
+
+However, I find it simpler to use the command-line tool 'oc':
+ - On RedHat platforms the package is called 'origin-clients' and is installed
+ from the OpenShift repository, which is available as the package 'centos-release-openshift-origin'.
+ - For other distributions, check here (we are running version 3.7)
+    https://docs.openshift.org/latest/cli_reference/get_started_cli.html#installing-the-cli
+
+The OpenShift developer guide is also good documentation to start with:
+    https://docs.openshift.com/container-platform/3.7/dev_guide/index.html
+
+Infrastructure
+--------------
+ - We have 3 servers running with the names ipekatrin[1-3].ipe.kit.edu. These are internal names. External
+ access is provided using 2 virtual ping-pong IPs, katrin[1-2].ipe.kit.edu. By default they are assigned
+ to both master servers of the cluster, but both will migrate to the single surviving server if one of the
+ masters dies. This is handled by the keepalived daemon and ensures load balancing and high availability.
+ The domain name 'kaas.kit.edu' resolves to both IPs in round-robin fashion.
+
+ - By default, the running services have names of the form '<service-name>.kaas.kit.edu'. For instance,
+ you can test
+    adei-katrin.kaas.kit.edu            - This is an ADEI service running on the new platform
+    adas-autogen.kaas.kit.edu           - Sample ADEI with generated data
+    katrin.kaas.kit.edu                 - Placeholder for the future katrin router
+    etc.
+
+ - The OpenVPN connection to the KATRIN virtual network runs on the master servers. Non-masters route the traffic
+ through the masters using the keepalived IP. So, the katrin network should be transparently visible from any pod in
+ the cluster.
+
+Users
+-----
+ I have configured a few user accounts using ADEI and UFO passwords. Furthermore, to avoid a mess of
+containers, I have created a number of projects with appropriate administrators.
+  kaas  (csa, kopmann)  - This is a routing service (basically Apache mod_rewrite) to set redirects from http://katrin.kit.edu/*
+  katrin (katrin)       - Katrin database
+  adei (csa)            - All ADEI setups
+  bora (ntj)            - BORA
+  web (kopmann)         - Various web sites, etc.
+  mon (csa)             - Monitoring
+  test (*)              - Project for testing
+
+If needed, I can create more projects/users. Just let me know.
+
+Storage
+-------
+ I have created several gluster volumes for different purposes:
+    katrin_data         - For katrin data files
+    datastore           - Other non-katrin large data files
+    openshift           - 3 times replicated volume for configuration, sources, and other important small files
+    temporary           - Logs, temporary files, etc.
+
+ Again, to avoid mixing data from different projects, each volume has subfolders for all projects. Furthermore,
+ I have tried to add a bit of protection and assigned each project a range of group ids. The subfolders can only be read
+ by the appropriate group. I also pre-created the corresponding PersistentVolumes (pv) and PersistentVolumeClaims (pvc): 'katrin', 'data', ...
+
+ There is a special pvc called 'host'. It saves data on the local RAID array, bypassing gluster (i.e. on each OpenShift node
+ the content of the folder will be different).
+
+ WARNING: Gluster supports dynamic provisioning using Heketi. It is installed and works. However, Heketi is far from being
+ of production quality. I think it is OK to use it for some temporary data if you want, but I would suggest using pre-created
+ volumes for important data.
+
+ - Currently, I don't plan to provide access to the servers themselves. The storage should be managed solely from the OpenShift pods.
+ I made a sample 'manager' pod equipped with scp, lftp, curl, etc. It mounts all default storage. You need to start it and then
+ you can connect interactively using either the web interface or the console app.
+        oc -n katrin scale dc/kaas-manager --replicas 1
+        oc -n katrin rsh dc/kaas-manager
+ This is just an example; build your own configuration with the required set of packages.
+
+Databases
+---------
+ Gluster works fine if you mostly read data or perform mostly sequential writes. It copes very badly with 'databases' and similar
+ loads. I guess it should not be an issue for the Katrin database as it is relatively small (AFAIK) and does not perform many writes. For something
+ like ADEI, gluster is not a viable option to back the MySQL server. There are several options to handle volumes for applications performing a
+ large number of small random writes:
+    - If High Availability (HA) is not important, just pin a pod to a certain node and use the 'host' pvc.
+    - For databases, Master/Slave replication can be enabled (you will still need to pin nodes and use the 'host' pvc). Alternatively, a Galera cluster
+    can be installed for multi-master replication. It is configured using the StatefulSet feature of OpenShift. I have not tested recovery thoroughly,
+    but it is working, quite performant, and the masters are synchronized without problems.
+    - For non-database applications, Gluster block storage may be used. The block storage is not shared between multiple pods, but is private
+    to a specific pod. So, it is possible to avoid a certain amount of locking and context switches, and performance is significantly better. I was
+    even able to run the ADEI database on top of such a device, though it is still significantly slower than native host performance. There is again
+    a heketi-based provisioner, but it works even worse than the one providing standard Gluster volumes. So, I suggest asking me to create block
+    devices manually if necessary.
+
+ Overall, if you have a data-intensive workload, we can discuss the best approach.
+
\ No newline at end of file
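
The first option in the README's Databases section (pin a pod to a certain node and use the 'host' pvc) is described only in words. The sketch below shows one way such a pod definition might be generated. It is not taken from the commit: the pod name, image, and mount path are hypothetical, and it assumes each node carries the standard 'kubernetes.io/hostname' label with a value equal to its internal server name, which should be checked on the actual cluster.

```python
import json


def pinned_pod_manifest(name, image, node, claim="host", mount_path="/var/lib/mysql"):
    """Build a Pod manifest pinned to one node and backed by a pre-created pvc.

    All argument values are supplied by the caller; the defaults for the
    mount path is illustrative only. 'host' is the pvc name from the README.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pin the pod to a single node through the hostname label
            # (assumption: the label value matches the node's server name).
            "nodeSelector": {"kubernetes.io/hostname": node},
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            # Mount the pre-created claim instead of a gluster-backed volume.
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim}}
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image; the node name follows the README's pattern.
    manifest = pinned_pod_manifest("mysql-master", "mysql:5.7", "ipekatrin1.ipe.kit.edu")
    print(json.dumps(manifest, indent=2))
```

Since 'oc create -f -' accepts JSON on standard input, the printed manifest could be piped into it for a chosen project. Because the 'host' pvc holds different content on every node, moving the pod to another node means starting from an empty data directory, which is why this option only fits when HA is not important.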
