Production Cluster Configuration: Difference between revisions

From WilliamsNet Wiki
(23 intermediate revisions by the same user not shown)
Line 7: Line 7:
! activity !! gitlab !! script/procedures/config !! IP !! hostname(s)
! activity !! gitlab !! script/procedures/config !! IP !! hostname(s)
|-
|-
| [[BeeGFS Installation]] || install the parallel filesystem components on controller & nodes to support the /shared filesystem || || ||
| [[Ceph Storage Cluster]] || k8s-admin || || ||
|-
|-
| [[NGINX-ingress]] || k8s-admin || || 10.0.0.111 ||
| [[Rook Storage for Kubernetes|Rook Storage]] || k8s-admin || || || (StorageClass)<br/>rook-ceph-hdd<br/>rook-ceph-nvme
|-
| [[GitLab]] || gitlab || || 10.0.0.112 || gitlab.williams.localnet <br />gitlab.williams-net.org
|-
|-
| gitlab registry secrets || || gitlab-registry-kube-system.yaml <br /> gitlab-registry-secret.yaml || ||
| gitlab registry secrets || || gitlab-registry-kube-system.yaml <br /> gitlab-registry-secret.yaml || ||
|-
|-
| [[Harbor Registry]] || k8s-admin || || 10.0.0.115 || harbor.williams.localnet
| wordpress (dredwilliams.com) || k8s/dredwilliams || || || dredwilliams.williams-net.org
|-
|-
| rsyslog|| rsyslog || || 10.0.0.113 || rsyslog.williams.localnet
| mediawiki || mediawiki || || 10.0.0.116 || wiki.williams.localnet <br />wiki.williams-net.org
|-
| mail || mail || || 10.0.0.114 || mail.williams.localnet
|-
|-
| wordpress (dredwilliams.com) || dredwilliams || || || dredwilliams.williams-net.org
| [[MariaDB]] || mariadb || || 10.0.0.117 || database.williams.localnet
|-
| mediawiki || mediawiki || || 10.0.0.116 || wiki.williams.localnet <br />wiki.williams-net.org
|}
|}


=== Storage ===
=== Storage ===
The production cluster depends on the '''/shared''' filesystem for its persistent storage.  The BeeGFS components are installed as shown here:
The production cluster depends on the '''/shared''' filesystem for its persistent storage as provided by the production Ceph cluster.  The Ceph is configured as shown here:
{| class="wikitable"
{| class="wikitable"
|-
|-
! component !! system !! location !! storage !! size
! system !! function !! storage !! size
|-
|-
| Management Server || ramandu || /home/beegfs-mgmtd || local HD || ~780G (shared)
| caspian || master || NVMe<br/>HDD || 1TB<br/>1TB
|-
|-
| Metadata Server || ramandu || /home/beegfs-meta || local HD || ~780G (shared)
| uvilas || node || NVMe<br/>HDD<br/>HDD || 1TB<br/>1TB<br/>1TB
|-
|-
| Storage Server || ramandu || /home/beegfs-data || local HD || ~780G (shared)
| belisar || node || NVMe<br/>HDD || 1TB<br/>250GB
|}
|}


Systems that require access to both the development filesystem ('''/workspace''') and the production filesystem ('''/shared''') require a [[BeeGFS Installation#Mounting multiple filesystems on the same client|special client configuration]].
The work filesystem can be mounted via NFS:
 
10.0.0.75:/work /work nfs4 soft 0 0
 
=== Backups ===


=== Dashboard Token ===
In addition to the normal backups configured in the basic OS installation steps, the databases in the production cluster must be backed up daily using the 'mysqldump' command:
Obtain the token needed to log into the dashboard with this command:
kubectl -n kube-system describe secrets \
    `kubectl -n kube-system get secrets | awk '/clusterrole-aggregation-controller/ {print $1}'` \
    | awk '/token:/ {print $2}'   
The current token for the Production cluster is:
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJjbHVzdGVycm9sZS1hZ2dyZWdhdGlvbi1jb250cm9sbGVyLXRva2VuLTdydDQ3Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImNsdXN0ZXJyb2xlLWFnZ3JlZ2F0aW9uLWNvbnRyb2xsZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIwYjk1NmU5Yi01MmJiLTQwMWEtYTgwOC03MWI5YWVjNDZjNGQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06Y2x1c3RlcnJvbGUtYWdncmVnYXRpb24tY29udHJvbGxlciJ9.bOL_ObIZ5vNkTlMd1Cdxsy6AHd_LRH-uf3-6g3YeKVoCtaKkGyR9C7mZlTQrpc6844l4sGMWBWW5HytCK9JTBoHpDADeJZQa0Q5S8cyQMPpNJUukatxzUtHN07FZ6iIl6j_wqLvVJq1dPcu_orD2HGUt7peb0FJ8Ut17opGjR9elLdR0AbZy91EJMoNj5tDCXn0-hdtjbNTu0mGzXfON9Mt3ZIjbXE31uJlji-5KfZjPzhqV0UI7v0R3yoEfPINZlqX7xmqeJt8lI0z-rgRdygLmepRaT6CYpP6IJvAsog06JpQpoU0mZmWKOqEYHS7K_AFGRV5z3vp7QLSPi1PKFA


TBD


=== Kubernetes Node Join Command ===
These commands should be inserted into the /etc/cron.daily/backup file on one of the cluster nodes (telmar is a good choice). The first does a complete database dump of the MediaWiki database server, the second dumps just the mediawiki database itself, and the third dumps the general purpose database server. Additional dump commands should be inserted for additional significant databases, as parsing individual databases out of a system dump can be tedious.
kubeadm join 10.0.0.10:6443 --token hqxg8k.bcz5utygyd2sa4yn \
    --discovery-token-ca-cert-hash sha256:ec16325aa0d701961337bc15889e8a90dd1f2d37e08f47d6211d4d7b839b4eb3 \
    --ignore-preflight-errors Swap --node-name=`hostname -s`

Latest revision as of 23:09, 14 September 2024

These packages form the basic functionality of the production cluster.

Scripts & config files are checked into gitlab under the Kubernetes group project listed.

{| class="wikitable"
|-
! activity !! gitlab !! script/procedures/config !! IP !! hostname(s)
|-
| [[Ceph Storage Cluster]] || k8s-admin || || ||
|-
| [[Rook Storage for Kubernetes|Rook Storage]] || k8s-admin || || || (StorageClass)<br/>rook-ceph-hdd<br/>rook-ceph-nvme
|-
| gitlab registry secrets || || gitlab-registry-kube-system.yaml <br /> gitlab-registry-secret.yaml || ||
|-
| wordpress (dredwilliams.com) || k8s/dredwilliams || || || dredwilliams.williams-net.org
|-
| mediawiki || mediawiki || || 10.0.0.116 || wiki.williams.localnet <br />wiki.williams-net.org
|-
| [[MariaDB]] || mariadb || || 10.0.0.117 || database.williams.localnet
|}
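
As an illustration of how the two StorageClass names listed above (rook-ceph-hdd and rook-ceph-nvme) might be consumed, the following is a minimal sketch that prints a PersistentVolumeClaim manifest. The function name <code>make_pvc</code>, the claim name, the size, and the <code>ReadWriteOnce</code> access mode are assumptions made for the example; they are not taken from the cluster's actual configuration.

```shell
#!/bin/sh
# Sketch: emit a PersistentVolumeClaim manifest for one of the Rook StorageClasses.
# ASSUMPTION: claim name, size, and access mode are hypothetical examples.
make_pvc() {
    # $1 = claim name, $2 = requested size (e.g. 10Gi), $3 = StorageClass name
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: $3
  resources:
    requests:
      storage: $2
EOF
}

# Typical use (hypothetical claim name):
#   make_pvc demo-data 10Gi rook-ceph-nvme | kubectl apply -f -
```

Printing the manifest rather than applying it directly makes it possible to review the output, or check it into the k8s-admin project, before it reaches the cluster.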

=== Storage ===

The production cluster depends on the '''/shared''' filesystem for its persistent storage, which is provided by the production Ceph cluster. The Ceph cluster is configured as shown here:

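{| class="wikitable"
|-
! system !! function !! storage !! size
|-
| caspian || master || NVMe<br/>HDD || 1TB<br/>1TB
|-
| uvilas || node || NVMe<br/>HDD<br/>HDD || 1TB<br/>1TB<br/>1TB
|-
| belisar || node || NVMe<br/>HDD || 1TB<br/>250GB
|}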

The work filesystem can be mounted via NFS with this /etc/fstab entry:

10.0.0.75:/work /work nfs4 soft 0 0
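
One possible way to install that entry is sketched below. The helper name <code>ensure_fstab_entry</code> is hypothetical; the sketch only assumes that the line above belongs in /etc/fstab and that the /work mount point may still need to be created.

```shell
#!/bin/sh
# Sketch: add the /work NFS entry to an fstab file only if it is not already there.
FSTAB_LINE='10.0.0.75:/work /work nfs4 soft 0 0'

ensure_fstab_entry() {
    # $1 = path to the fstab file (defaults to /etc/fstab)
    fstab="${1:-/etc/fstab}"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "$FSTAB_LINE" "$fstab" 2>/dev/null \
        || printf '%s\n' "$FSTAB_LINE" >> "$fstab"
}

# Typical use (as root):
#   mkdir -p /work
#   ensure_fstab_entry /etc/fstab
#   mount /work
```

Because the check matches the complete line, running the helper repeatedly does not create duplicate entries.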

=== Backups ===

In addition to the normal backups configured in the basic OS installation steps, the databases in the production cluster must be backed up daily using the 'mysqldump' command:

TBD

These commands should be inserted into the /etc/cron.daily/backup file on one of the cluster nodes (telmar is a good choice). The first does a complete database dump of the MediaWiki database server, the second dumps just the mediawiki database itself, and the third dumps the general purpose database server. Add a separate dump command for each additional significant database, since extracting an individual database from a full server dump is tedious.
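
Since the actual commands are still marked TBD, the following is only a sketch of what the three dumps described above could look like. The backup directory, the MediaWiki database server hostname (<code>mediawiki-db.example</code>), the database name <code>mediawiki</code>, and the reliance on a client option file such as ~/.my.cnf for credentials are all assumptions; only database.williams.localnet comes from the table on this page.

```shell
#!/bin/sh
# Sketch of daily database dumps for /etc/cron.daily/backup.
# ASSUMPTIONS (hypothetical, not from the wiki): BACKUP_DIR, WIKI_DB_HOST,
# the database name "mediawiki", and credentials supplied via ~/.my.cnf.
BACKUP_DIR="${BACKUP_DIR:-/shared/backups/db}"
WIKI_DB_HOST="${WIKI_DB_HOST:-mediawiki-db.example}"
GENERAL_DB_HOST="${GENERAL_DB_HOST:-database.williams.localnet}"
DUMP="${DUMP:-mysqldump}"
STAMP="$(date +%Y%m%d)"

dump_to() {
    # $1 = output file name; remaining arguments are passed to mysqldump
    out="$BACKUP_DIR/$1"; shift
    # Write to a temporary file first so a failed dump never replaces a good one
    "$DUMP" --single-transaction "$@" > "$out.tmp" && mv "$out.tmp" "$out"
}

run_backups() {
    mkdir -p "$BACKUP_DIR" || return 1
    # 1. complete dump of the MediaWiki database server
    dump_to "wiki-server-$STAMP.sql" -h "$WIKI_DB_HOST" --all-databases
    # 2. just the mediawiki database
    dump_to "mediawiki-$STAMP.sql" -h "$WIKI_DB_HOST" mediawiki
    # 3. complete dump of the general purpose database server
    dump_to "general-server-$STAMP.sql" -h "$GENERAL_DB_HOST" --all-databases
}

# In the cron script, finish with:
#   run_backups
```

Dumping the mediawiki database separately, in addition to the full server dump, follows the reasoning above: a single-database file can be restored directly without picking it out of a server-wide dump.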