Production Cluster Configuration: Difference between revisions

From WilliamsNet Wiki
m (added Rook configuration)
mNo edit summary
 
(9 intermediate revisions by the same user not shown)
Line 10: Line 10:
|-
|-
| [[Rook Storage for Kubernetes|Rook Storage]] || k8s-admin || || || (StorageClass)<br/>rook-ceph-hdd<br/>rook-ceph-nvme
| [[Rook Storage for Kubernetes|Rook Storage]] || k8s-admin || || || (StorageClass)<br/>rook-ceph-hdd<br/>rook-ceph-nvme
|-
| [[Contour Ingress Controller]] || k8s-admin || || 10.0.0.115 ||
|-
| [[GitLab]] || gitlab || || 10.0.0.112 || gitlab.williams.localnet <br />gitlab.williams-net.org
|-
|-
| gitlab registry secrets || || gitlab-registry-kube-system.yaml <br /> gitlab-registry-secret.yaml || ||
| gitlab registry secrets || || gitlab-registry-kube-system.yaml <br /> gitlab-registry-secret.yaml || ||
|-
| [[Quay Container Registry]] || k8s/quay || ||  || quay.williams.localnet <br/> (CNAME to lamppost.williams.localnet)
|-
| rsyslog|| k8s/rsyslog || || 10.0.0.113 || rsyslog.williams.localnet
|-
| mail || k8s/mail || || 10.0.0.114 || mail.williams.localnet
|-
|-
| wordpress (dredwilliams.com) || k8s/dredwilliams || || || dredwilliams.williams-net.org
| wordpress (dredwilliams.com) || k8s/dredwilliams || || || dredwilliams.williams-net.org
|-
|-
| mediawiki || k8s/mediawiki || || 10.0.0.116 || wiki.williams.localnet <br />wiki.williams-net.org
| mediawiki || mediawiki || || 10.0.0.116 || wiki.williams.localnet <br />wiki.williams-net.org
|-
|-
| [[MariaDB]] || k8s/mariadb || || 10.0.0.117 || database.williams.localnet
| [[MariaDB]] || mariadb || || 10.0.0.117 || database.williams.localnet
|}
|}


Line 36: Line 26:
! system !! function !! storage !! size
! system !! function !! storage !! size
|-
|-
| calormen || master || NVME || 1TB
| caspian || master || NVMe<br/>HDD || 1TB<br/>1TB
|-
|-
| telmar || node || NVMe<br/>HDD || 1TB<br/>1TB
| uvilas || node || NVMe<br/>HDD<br/>HDD || 1TB<br/>1TB<br/>1TB
|-
|-
| compute4 || node || NVMe<br/>HDD<br/>HDD || 1TB<br/>1TB<br/>1TB
| belisar || node || NVMe<br/>HDD || 1TB<br/>250GB
|}
|}


All systems mount the '''/shared''' ceph filesystem following the directions in the installation page.  The relevant line for /etc/fstab is:
The work filesystem can be mounted via NFS:


  10.0.0.3:/ /shared ceph name=prodcluster,_netfs 0 0
  10.0.0.75:/work /work nfs4 soft 0 0
 
The client keyring must be copied from the master node (calormen) and placed in the /etc/ceph directory on the client system prior to mounting.
 
The workspace filesystem is not generally available from the production cluster, as it lives on the StorageNet VLAN, but if needed it can be mounted via NFS:
 
controller:/workspace /workspace nfs4 soft 0 0


=== Backups ===
=== Backups ===
Line 57: Line 41:
In addition to the normal backups configured in the basic OS installation steps, the databases in the production cluster must be backed up daily using the 'mysqldump' command:
In addition to the normal backups configured in the basic OS installation steps, the databases in the production cluster must be backed up daily using the 'mysqldump' command:


  mysqldump -u root -pmenagerie --all-databases -h 10.96.244.162 > /shared/mediawiki-all.dump
  TBD
mysqldump -u root -pmenagerie bitnami_mediawiki -h 10.96.244.162 > /shared/mediawiki.dump
mysqldump -u root -pmenagerie --all-databases -h database.williams.localnet > /shared/database.dump


These commands should be inserted into the /etc/cron.daily/backup file on one of the cluster nodes (telmar is a good choice).  The first does a complete database dump of the MediaWiki database server, the second dumps just the mediawiki database itself, and the third dumps the general purpose database server.  Additional dump commands should be inserted for additional significant databases, as parsing individual databases out of a system dump can be tedious.
These commands should be inserted into the /etc/cron.daily/backup file on one of the cluster nodes (telmar is a good choice).  The first does a complete database dump of the MediaWiki database server, the second dumps just the mediawiki database itself, and the third dumps the general purpose database server.  Additional dump commands should be inserted for additional significant databases, as parsing individual databases out of a system dump can be tedious.
=== Dashboard Token ===
Obtain the token needed to log into the dashboard with this command:
kubectl -n kube-system describe secrets \
    `kubectl -n kube-system get secrets | awk '/clusterrole-aggregation-controller/ {print $1}'` \
    | awk '/token:/ {print $2}'   
The current token for the Production cluster is:
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJjbHVzdGVycm9sZS1hZ2dyZWdhdGlvbi1jb250cm9sbGVyLXRva2VuLTdydDQ3Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImNsdXN0ZXJyb2xlLWFnZ3JlZ2F0aW9uLWNvbnRyb2xsZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIwYjk1NmU5Yi01MmJiLTQwMWEtYTgwOC03MWI5YWVjNDZjNGQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06Y2x1c3RlcnJvbGUtYWdncmVnYXRpb24tY29udHJvbGxlciJ9.bOL_ObIZ5vNkTlMd1Cdxsy6AHd_LRH-uf3-6g3YeKVoCtaKkGyR9C7mZlTQrpc6844l4sGMWBWW5HytCK9JTBoHpDADeJZQa0Q5S8cyQMPpNJUukatxzUtHN07FZ6iIl6j_wqLvVJq1dPcu_orD2HGUt7peb0FJ8Ut17opGjR9elLdR0AbZy91EJMoNj5tDCXn0-hdtjbNTu0mGzXfON9Mt3ZIjbXE31uJlji-5KfZjPzhqV0UI7v0R3yoEfPINZlqX7xmqeJt8lI0z-rgRdygLmepRaT6CYpP6IJvAsog06JpQpoU0mZmWKOqEYHS7K_AFGRV5z3vp7QLSPi1PKFA
=== Kubernetes Node Join Command ===
kubeadm join 10.0.0.10:6443 --token hqxg8k.bcz5utygyd2sa4yn \
    --discovery-token-ca-cert-hash sha256:ec16325aa0d701961337bc15889e8a90dd1f2d37e08f47d6211d4d7b839b4eb3 \
    --ignore-preflight-errors Swap --node-name=`hostname -s`

Latest revision as of 23:09, 14 September 2024

These packages form the basic functionality of the production cluster.

Scripts and config files are checked into GitLab under the Kubernetes group project listed for each activity.

{|
! activity !! gitlab !! script/procedures/config !! IP !! hostname(s)
|-
| Ceph Storage Cluster || k8s-admin || || ||
|-
| [[Rook Storage for Kubernetes|Rook Storage]] || k8s-admin || || || (StorageClass)<br/>rook-ceph-hdd<br/>rook-ceph-nvme
|-
| gitlab registry secrets || || gitlab-registry-kube-system.yaml <br /> gitlab-registry-secret.yaml || ||
|-
| wordpress (dredwilliams.com) || k8s/dredwilliams || || || dredwilliams.williams-net.org
|-
| mediawiki || mediawiki || || 10.0.0.116 || wiki.williams.localnet <br />wiki.williams-net.org
|-
| [[MariaDB]] || mariadb || || 10.0.0.117 || database.williams.localnet
|}
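
The table lists two Rook StorageClasses, rook-ceph-hdd and rook-ceph-nvme. As a rough sketch (not part of the original procedures), the helper below checks that both exist before workloads that request them are deployed. It assumes kubectl is already configured for the production cluster; the function name check_storage_classes is hypothetical.

```shell
#!/bin/sh
# Sketch: report which of the StorageClasses named in the table are missing.
# Assumes kubectl is configured for the production cluster.
# check_storage_classes is a hypothetical helper name, not an existing script.
check_storage_classes() {
    missing=0
    for sc in rook-ceph-hdd rook-ceph-nvme; do
        # "kubectl get storageclass NAME" exits non-zero if NAME does not exist
        if ! kubectl get storageclass "$sc" >/dev/null 2>&1; then
            echo "$sc"
            missing=1
        fi
    done
    # Non-zero return means at least one class was not found
    return "$missing"
}
```

Calling check_storage_classes prints the name of each missing class, one per line, and returns non-zero if any are absent, so it can gate a deployment script.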

=== Storage ===

The production cluster depends on the /shared filesystem, provided by the production Ceph cluster, for its persistent storage. The Ceph cluster is configured as shown here:

{|
! system !! function !! storage !! size
|-
| caspian || master || NVMe<br/>HDD || 1TB<br/>1TB
|-
| uvilas || node || NVMe<br/>HDD<br/>HDD || 1TB<br/>1TB<br/>1TB
|-
| belisar || node || NVMe<br/>HDD || 1TB<br/>250GB
|}

The work filesystem can be mounted via NFS by adding this line to /etc/fstab:

  10.0.0.75:/work /work nfs4 soft 0 0
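
One way to apply the mount entry above without duplicating it on repeated runs is sketched below. It is illustrative only: the function name add_work_mount is hypothetical, and the file defaults to /etc/fstab on the assumption that this is where the entry belongs.

```shell
#!/bin/sh
# Sketch: idempotently add the /work NFS entry to an fstab-style file.
# add_work_mount is a hypothetical helper name.
WORK_ENTRY='10.0.0.75:/work /work nfs4 soft 0 0'

# Usage: add_work_mount [FILE]   (FILE defaults to /etc/fstab)
add_work_mount() {
    fstab="${1:-/etc/fstab}"
    # -x matches the whole line, -F treats the entry as a fixed string
    if ! grep -qxF "$WORK_ENTRY" "$fstab"; then
        printf '%s\n' "$WORK_ENTRY" >> "$fstab"
    fi
}
```

After running add_work_mount as root, creating the mount point with "mkdir -p /work" and running "mount /work" should bring the filesystem up, provided the NFS server is reachable from that host.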

=== Backups ===

In addition to the normal backups configured in the basic OS installation steps, the databases in the production cluster must be backed up daily using the 'mysqldump' command:

TBD

These commands should be inserted into the /etc/cron.daily/backup file on one of the cluster nodes (telmar is a good choice). The first does a complete database dump of the MediaWiki database server, the second dumps just the mediawiki database itself, and the third dumps the general-purpose database server. Add further dump commands for any other significant databases, since extracting an individual database from a full server dump is tedious.
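
Since the current revision leaves the dump commands as TBD, the following is only a sketch of how /etc/cron.daily/backup could wrap mysqldump. The host database.williams.localnet comes from the table above. The dump directory, the credentials file path, and the function name dump_db are assumptions for illustration. It reads credentials from a client option file rather than passing a password on the command line, and writes each dump to a temporary file first so that a failed run does not overwrite the previous good dump.

```shell
#!/bin/sh
# Sketch of a helper for /etc/cron.daily/backup (assumptions marked below).
DB_HOST="${DB_HOST:-database.williams.localnet}"   # from the services table
DUMP_DIR="${DUMP_DIR:-/work/backups}"              # hypothetical location
DB_DEFAULTS="${DB_DEFAULTS:-/root/.my.cnf}"        # hypothetical credentials file

# Usage: dump_db NAME [mysqldump arguments...]
# Writes $DUMP_DIR/NAME.dump only if mysqldump succeeds.
dump_db() {
    name="$1"
    shift
    tmp="$DUMP_DIR/$name.dump.tmp"
    # --defaults-extra-file must be the first option given to mysqldump
    if mysqldump --defaults-extra-file="$DB_DEFAULTS" -h "$DB_HOST" "$@" > "$tmp"; then
        mv "$tmp" "$DUMP_DIR/$name.dump"
    else
        rm -f "$tmp"
        echo "backup of $name failed" >&2
        return 1
    fi
}
```

With this in place, a full server dump would be "dump_db database --all-databases", and a single database could be dumped with, for example, "dump_db mediawiki bitnami_mediawiki" (the database name used in an earlier revision of this page).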