# Operations guides for Sourcegraph on Kubernetes

<Callout type="warning">
	The Kustomize deployment type is planned for deprecation and will be sunset in a future release. We recommend using the [Helm deployment](/self-hosted/deploy/kubernetes) for all new Kubernetes installations.
</Callout>

Operations guides specific to managing [Sourcegraph on Kubernetes](/self-hosted/deploy/kubernetes/) installations.

<QuickLinks>

    <QuickLink title="Installation" icon='lightbulb' href="/self-hosted/deploy/kubernetes/" />
    <QuickLink title="Introduction" icon='theming' href="/self-hosted/deploy/kubernetes/kustomize" />
    <QuickLink title="Configuration" icon='installation' href="/self-hosted/deploy/kubernetes/configure" />
    <QuickLink title="Maintenance" icon='presets' href="#" />

</QuickLinks>

## Featured guides

Trying to deploy Sourcegraph on Kubernetes? Refer to our [installation guide](/self-hosted/deploy/kubernetes/#installation).

## Configure

We strongly recommend reading our [Configuration guide](/self-hosted/deploy/kubernetes/configure) to learn how to configure your Sourcegraph instance on Kubernetes.

## Deploy

Refer to our [installation guide](/self-hosted/deploy/kubernetes/) for details on how to deploy Sourcegraph.

Migrating from another [deployment type](/self-hosted/deploy/)? Refer to our [migration guides](/self-hosted/deploy/migrate-backup).

## Deploy with Kustomize

To deploy Sourcegraph with the configuration for your cluster:

#### Building manifests

Build a new set of manifests using an overlay you've created following our [configuration guide for Kustomize](/self-hosted/deploy/kubernetes/kustomize/):

```bash
$ kubectl kustomize $PATH_TO_OVERLAY -o cluster.yaml
```

#### Reviewing manifests

Review the manifests generated in the previous step:

```bash
$ less cluster.yaml
```

#### Applying manifests

Run the command below to apply the manifests from the output file `cluster.yaml` to the connected cluster:

```bash
$ kubectl apply --prune -l deploy=sourcegraph -f cluster.yaml
```

Once you have applied your changes:

-   _Watch_ - verify your deployment has started:

    ```bash
    $ kubectl get pods -A -o wide --watch
    ```

-   _Port-forward_ - verify Sourcegraph is running by temporarily making the frontend port accessible:

    ```sh
    $ kubectl port-forward svc/sourcegraph-frontend 3080:30080
    ```

-   _Log in_ - browse to your Sourcegraph deployment, log in, and verify the instance is working as expected.
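
The build, apply, and verify steps above can be chained once an overlay has already been reviewed. The following is a minimal sketch, not an official Sourcegraph tool: the function name `deploy_overlay` is hypothetical, and it assumes `sourcegraph-frontend` is a Deployment in the current namespace. It skips the manual review step, so use it only for overlays you have already inspected.

```bash
# Hypothetical helper: build an overlay, apply it, and wait for the frontend.
# Usage: deploy_overlay PATH_TO_OVERLAY [OUTPUT_FILE]
deploy_overlay() {
    local overlay="$1"
    local out="${2:-cluster.yaml}"

    # Build the manifests; stop if the overlay does not build.
    kubectl kustomize "$overlay" -o "$out" || return 1

    # Apply the manifests, pruning labeled resources that are no longer defined.
    kubectl apply --prune -l deploy=sourcegraph -f "$out" || return 1

    # Block until the frontend rollout finishes or the timeout expires.
    # ASSUMPTION: sourcegraph-frontend is a Deployment in the current namespace.
    kubectl rollout status deployment/sourcegraph-frontend --timeout=10m
}
```

For example, `deploy_overlay "$PATH_TO_OVERLAY"` writes `cluster.yaml`, applies it, and returns a non-zero status if any step fails.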

## Compare overlays

The commands below print the differences between two overlays, or between an overlay and the running cluster. Use them to review the changes and confirm that a new overlay produces the resources you expect before applying it.

### Between two overlays

To compare resources between two different Kustomize overlays:

```bash
$ diff \
    <(kubectl kustomize $PATH_TO_OVERLAY_1) \
    <(kubectl kustomize $PATH_TO_OVERLAY_2) |\
    more
```

Example 1: compare the resources generated by the k3s overlay for a size xs instance with those generated by the k3s overlay for a size xl instance:

```bash
$ diff \
    <(kubectl kustomize examples/k3s/xs) \
    <(kubectl kustomize examples/k3s/xl) |\
    more
```

Example 2: compare the new base cluster with the old cluster:

```bash
$ diff \
    <(kubectl kustomize examples/base) \
    <(kubectl kustomize examples/old-cluster) |\
    more
```

Example 3: compare the output files from two different overlay builds:

```bash
$ kubectl kustomize examples/old-cluster -o old-cluster.yaml
$ kubectl kustomize examples/base -o new-cluster.yaml

$ diff old-cluster.yaml new-cluster.yaml
```

### Between an overlay and a running cluster

To compare the manifests generated by an overlay with the resources in use by the cluster that `kubectl` is currently connected to:

```bash
$ kubectl kustomize $PATH_TO_OVERLAY | kubectl diff -f  -
```

The command prints the differences between the manifests generated by the overlay and the resources currently running in the cluster, so that you can confirm the overlay produces the resources you expect before applying it.

Example: compare the k3s overlay for a size xl instance with the instance that `kubectl` is connected to:

```bash
$ kubectl kustomize examples/k3s/xl | kubectl diff -f  -
```
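
The comparison above can also be used in scripts, because `kubectl diff` reports its result through its exit status: 0 when there are no differences, 1 when differences were found, and a value greater than 1 when `kubectl` or `diff` failed. The following is a minimal sketch under that assumption; the function name `overlay_drift` is hypothetical.

```bash
# Hypothetical helper: report whether an overlay differs from the live cluster.
# Relies on the exit status of `kubectl diff`:
#   0 = no differences, 1 = differences found, >1 = kubectl or diff failed.
overlay_drift() {
    local overlay="$1"
    local rc

    kubectl kustomize "$overlay" | kubectl diff -f - > /dev/null
    rc=$?   # exit status of `kubectl diff`, the last command in the pipeline

    case "$rc" in
        0) echo "in sync" ;;
        1) echo "differs" ;;
        *) echo "error"; return 2 ;;
    esac
}
```

For example, `overlay_drift examples/k3s/xl` prints `differs` when applying the overlay would change the cluster.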

## List pods in cluster

List all pods in your cluster and the corresponding health status of each pod:

```bash
$ kubectl get pods -o=wide
```

## Tail logs for specific pod

Tail the logs for the specified pod:

```bash
$ kubectl logs -f $POD_NAME
```

If Sourcegraph is unavailable and the `sourcegraph-frontend-*` pod(s) are not in status `Running`, view their logs with `kubectl logs -f sourcegraph-frontend-$POD_ID` (filling in `$POD_ID` from the `kubectl get pods` output). Inspect both the log messages printed at startup (at the beginning of the log output) and the most recent log messages.
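
When several frontend pods are unhealthy, looking up each pod ID by hand gets tedious. The following is a minimal sketch that collects recent logs from every `sourcegraph-frontend-*` pod whose status is not `Running`. The function name `frontend_failure_logs` is hypothetical, and the sketch assumes the default column layout of `kubectl get pods` (NAME first, STATUS third).

```bash
# Hypothetical helper: print recent logs for every frontend pod that is not Running.
frontend_failure_logs() {
    local pod
    # Column 1 is NAME and column 3 is STATUS in the default `kubectl get pods` output.
    for pod in $(kubectl get pods --no-headers |
            awk '$1 ~ /^sourcegraph-frontend-/ && $3 != "Running" { print $1 }'); do
        echo "=== $pod ==="
        # --tail limits the output to the most recent lines;
        # drop it to also see the startup messages at the top of the log.
        kubectl logs --tail=100 "$pod"
    done
}
```

Add `-n $NAMESPACE` to both `kubectl` calls if the pods are not in the default namespace.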

## Retrieving resource information

Display detailed information about the status of a single pod:

```bash
$ kubectl describe pod $POD_NAME
```

List all Persistent Volume Claims (PVCs) and their statuses:

```bash
$ kubectl get pvc
```

List all Persistent Volumes (PVs) that have been provisioned.
In a healthy cluster, there should be a one-to-one mapping between PVs and PVCs:

```bash
$ kubectl get pv
```
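
To spot a broken PV-to-PVC mapping quickly, you can filter the PVC list for claims that are not bound. The following is a minimal sketch; the function name `unbound_pvcs` is hypothetical, and it assumes the default column layout of `kubectl get pvc` (NAME first, STATUS second).

```bash
# Hypothetical helper: list PVCs whose STATUS column is anything other than Bound.
unbound_pvcs() {
    # Column 1 is NAME and column 2 is STATUS in the default `kubectl get pvc` output.
    kubectl get pvc --no-headers | awk '$2 != "Bound" { print $1 " (" $2 ")" }'
}
```

An empty result means every claim in the current namespace is bound to a volume.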

List all events in the cluster's history:

```bash
$ kubectl get events
```

Delete a failing pod so that it gets recreated, possibly on a different node:

```bash
$ kubectl delete pod $POD_NAME
```

Remove all pods from a node and mark it as unschedulable to prevent new pods from arriving:

```bash
$ kubectl drain --force --ignore-daemonsets --delete-emptydir-data $NODE
```
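
A drained node stays unschedulable until it is uncordoned. The following is a minimal sketch of returning a node to service after maintenance; the function name `uncordon_node` is hypothetical.

```bash
# Hypothetical helper: return a drained node to service after maintenance.
uncordon_node() {
    local node="$1"

    # Mark the node schedulable again; `kubectl drain` leaves it cordoned.
    kubectl uncordon "$node" || return 1

    # Show the node so you can confirm SchedulingDisabled is gone from STATUS.
    kubectl get node "$node"
}
```

For example, run `uncordon_node "$NODE"` once the maintenance on `$NODE` is finished.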

Restart the Sourcegraph frontend deployment:

```bash
$ kubectl rollout restart deployment sourcegraph-frontend
```

## Access the database

Get the name of a `pgsql` pod:

```bash
$ kubectl get pods -l app=pgsql
NAME                     READY     STATUS    RESTARTS   AGE
pgsql-76a4bfcd64-rt4cn   2/2       Running   0          19m
```

Make sure you are operating under the correct namespace (e.g. add `-n prod` if your pod is under the `prod` namespace).

Open a PostgreSQL interactive terminal:

```bash
$ kubectl exec -it pgsql-76a4bfcd64-rt4cn -- psql -U sg
```

Run your SQL query:

```sql
SELECT * FROM users;
```

> NOTE: To execute an SQL query against the database without first creating an interactive session (as above), append `--command "SELECT * FROM users;"` to the `kubectl exec` command.
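
The lookup and the one-off query can be combined so that the pod name does not have to be copied by hand. The following is a minimal sketch; the function name `sg_query` is hypothetical, and it assumes the pod carries the `app=pgsql` label used in the listing above.

```bash
# Hypothetical helper: run one SQL statement against the primary database.
# ASSUMPTION: the pgsql pod carries the label app=pgsql.
sg_query() {
    local sql="$1"
    local pod

    # Resolve the name of the first pgsql pod instead of copying it by hand.
    pod="$(kubectl get pods -l app=pgsql -o jsonpath='{.items[0].metadata.name}')" || return 1
    [ -n "$pod" ] || { echo "no pgsql pod found" >&2; return 1; }

    # -i is enough here because no interactive terminal is needed.
    kubectl exec -i "$pod" -- psql -U sg --command "$sql"
}
```

For example, `sg_query "SELECT * FROM users;"` prints the query result and exits.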

## Backup and restore

The following instructions are specific to backing up and restoring the Sourcegraph databases in a Kubernetes deployment. These do not apply to other deployment types.

> WARNING: **Only core data will be backed up**.
>
> These instructions will only back up core data including user accounts, configuration, repository-metadata, etc. Other data will be regenerated automatically:
>
> -   Repositories will be re-cloned
> -   Search indexes will be rebuilt from scratch
>
> The above may take a while if you have a lot of repositories. In the meantime, searches may be slow or return incomplete results. This process rarely takes longer than 6 hours and is usually **much** faster.

> NOTE: In some places you will see `$NAMESPACE` used. Add `-n $NAMESPACE` to commands if you are not using the default namespace.
> More kubectl configuration options can be found here: [kubectl Cheat Sheet](https://kubernetes.io/docs/reference/kubectl/cheatsheet/)

### Back up Sourcegraph databases

These instructions will back up the primary `sourcegraph` database, the [codeintel](/code-navigation/) database, and the [codeinsights](/code-insights/) database.

A. Verify that the deployment is running

```bash
$ kubectl get pods -A
```

B. Stop all connections to the database by scaling down or removing the frontend deployment

```bash
$ kubectl scale --replicas=0 deployment/sourcegraph-frontend
# or
$ kubectl delete deployment sourcegraph-frontend
```

C. Check for corrupt database indexes. If amcheck returns errors, please reach out to [support@sourcegraph.com](mailto:support@sourcegraph.com)

```sql
create extension amcheck;

select bt_index_parent_check(c.oid, true), c.relname, c.relpages
from pg_index i
join pg_opclass op ON i.indclass[0] = op.oid
join pg_am am ON op.opcmethod = am.oid
join pg_class c ON i.indexrelid = c.oid
join pg_namespace n ON c.relnamespace = n.oid
where am.amname = 'btree'
-- Don't check temp tables, which may be from another session:
and c.relpersistence != 't'
-- Function may throw an error when this is omitted:
and i.indisready AND i.indisvalid;
```

D. Generate the database dumps

```bash
$ kubectl exec -it $pgsql_POD_NAME -- bash -c 'pg_dump -C --clean --if-exists --username sg sg' > sourcegraph_db.out
$ kubectl exec -it $codeintel-db_POD_NAME -- bash -c 'pg_dump -C --clean --if-exists --username sg sg' > codeintel_db.out
$ kubectl exec -it $codeinsights-db_POD_NAME -- bash -c 'pg_dump -C --clean --if-exists --username postgres postgres' > codeinsights_db.out
```

Ensure the `sourcegraph_db.out`, `codeintel_db.out` and `codeinsights_db.out` files are moved to a safe and secure location.

### Restore Sourcegraph databases

#### Restoring Sourcegraph databases into a new environment

The following instructions apply only if you are restoring your databases into a new deployment of Sourcegraph (i.e. a new virtual machine).

If you are restoring a previously running environment, see the instructions for [restoring a previously running deployment](#restoring-sourcegraph-databases-into-an-existing-environment).

A. Copy the database dump files (e.g. `sourcegraph_db.out`, `codeintel_db.out` and `codeinsights_db.out`) into your deployment directory

B. Start the database services by running the following commands from your deployment directory:

```bash
$ kubectl rollout restart deployment pgsql
$ kubectl rollout restart deployment codeintel-db
$ kubectl rollout restart deployment codeinsights-db
```

C. Copy the database files into the pods by running the following commands:

```bash
$ kubectl cp sourcegraph_db.out $NAMESPACE/$pgsql_POD_NAME:/tmp/sourcegraph_db.out
$ kubectl cp codeintel_db.out $NAMESPACE/$codeintel-db_POD_NAME:/tmp/codeintel_db.out
$ kubectl cp codeinsights_db.out $NAMESPACE/$codeinsights-db_POD_NAME:/tmp/codeinsights_db.out
```

D. Restore the databases

```bash
$ kubectl exec -it $pgsql_POD_NAME -- bash -c 'psql -v ON_ERROR_STOP=1 --username sg -f /tmp/sourcegraph_db.out sg'
$ kubectl exec -it $codeintel-db_POD_NAME -- bash -c 'psql -v ON_ERROR_STOP=1 --username sg -f /tmp/codeintel_db.out sg'
$ kubectl exec -it $codeinsights-db_POD_NAME -- bash -c 'psql -v ON_ERROR_STOP=1 --username postgres -f /tmp/codeinsights_db.out postgres'
```
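
The copy and restore steps can be combined per database. The following is a minimal sketch; the function name `restore_db` is hypothetical, and it assumes each database pod is labeled `app=<deployment name>`. It sets psql's `ON_ERROR_STOP` variable so that the restore aborts on the first SQL error instead of continuing with a partial import.

```bash
# Hypothetical helper: copy a dump into a database pod and restore it.
# ASSUMPTION: each database pod is labeled app=<deployment name>.
# Usage: restore_db APP_LABEL DB_USER DB_NAME DUMP_FILE
restore_db() {
    local app="$1" user="$2" db="$3" dump="$4"
    local pod name

    pod="$(kubectl get pods -l "app=$app" -o jsonpath='{.items[0].metadata.name}')" || return 1
    [ -n "$pod" ] || { echo "no pod found for app=$app" >&2; return 1; }

    # Keep only the file name so the dump always lands directly in /tmp.
    name="$(basename "$dump")"

    kubectl cp "$dump" "$pod:/tmp/$name" || return 1
    kubectl exec -i "$pod" -- psql -v ON_ERROR_STOP=1 --username "$user" -f "/tmp/$name" "$db"
}
```

For example, `restore_db pgsql sg sg sourcegraph_db.out` restores the primary database; the other two databases follow the same pattern with their own labels, users, and dump files.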

E. Check for corrupt database indexes. If amcheck returns errors, please reach out to [support@sourcegraph.com](mailto:support@sourcegraph.com)

```sql
create extension amcheck;

select bt_index_parent_check(c.oid, true), c.relname, c.relpages
from pg_index i
join pg_opclass op ON i.indclass[0] = op.oid
join pg_am am ON op.opcmethod = am.oid
join pg_class c ON i.indexrelid = c.oid
join pg_namespace n ON c.relnamespace = n.oid
where am.amname = 'btree'
-- Don't check temp tables, which may be from another session:
and c.relpersistence != 't'
-- Function may throw an error when this is omitted:
and i.indisready AND i.indisvalid;
```

F. Start the remaining Sourcegraph services by following the steps in [applying manifests](#applying-manifests).

#### Restoring Sourcegraph databases into an existing environment

A. Stop the existing deployment by removing the frontend deployment

```bash
$ kubectl scale --replicas=0 deployment/sourcegraph-frontend
# or
$ kubectl delete deployment sourcegraph-frontend
```

B. Remove any existing volumes for the databases in the existing deployment

```bash
$ kubectl delete pvc pgsql
$ kubectl delete pvc codeintel-db
$ kubectl delete pvc codeinsights-db
$ kubectl delete pv $pgsql_PV_NAME --force
$ kubectl delete pv $codeintel-db_PV_NAME --force
$ kubectl delete pv $codeinsights-db_PV_NAME --force
```
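
The `$..._PV_NAME` values are easiest to look up while the PVCs still exist, because each claim records the volume it is bound to. The following is a minimal sketch of that lookup; the function name `pv_for_pvc` is hypothetical.

```bash
# Hypothetical helper: print the name of the PV bound to a PVC.
# Run this BEFORE deleting the PVC; afterwards the binding is gone.
pv_for_pvc() {
    kubectl get pvc "$1" -o jsonpath='{.spec.volumeName}'
}
```

For example, `pgsql_PV_NAME="$(pv_for_pvc pgsql)"` captures the volume name for the primary database before the claim is removed.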

C. Copy the database dump files (e.g. `sourcegraph_db.out`, `codeintel_db.out` and `codeinsights_db.out`) into your deployment directory

D. Start the database services only

```bash
$ kubectl rollout restart deployment pgsql
$ kubectl rollout restart deployment codeintel-db
$ kubectl rollout restart deployment codeinsights-db
```

E. Copy the database files into the pods by running the following commands:

```bash
$ kubectl cp sourcegraph_db.out $NAMESPACE/$pgsql_POD_NAME:/tmp/sourcegraph_db.out
$ kubectl cp codeintel_db.out $NAMESPACE/$codeintel-db_POD_NAME:/tmp/codeintel_db.out
$ kubectl cp codeinsights_db.out $NAMESPACE/$codeinsights-db_POD_NAME:/tmp/codeinsights_db.out
```

F. Restore the databases

```bash
$ kubectl exec -it $pgsql_POD_NAME -- bash -c 'psql -v ON_ERROR_STOP=1 --username sg -f /tmp/sourcegraph_db.out sg'
$ kubectl exec -it $codeintel-db_POD_NAME -- bash -c 'psql -v ON_ERROR_STOP=1 --username sg -f /tmp/codeintel_db.out sg'
$ kubectl exec -it $codeinsights-db_POD_NAME -- bash -c 'psql -v ON_ERROR_STOP=1 --username postgres -f /tmp/codeinsights_db.out postgres'
```

G. Check for corrupt database indexes. If amcheck returns errors, please reach out to [support@sourcegraph.com](mailto:support@sourcegraph.com)

```sql
create extension amcheck;

select bt_index_parent_check(c.oid, true), c.relname, c.relpages
from pg_index i
join pg_opclass op ON i.indclass[0] = op.oid
join pg_am am ON op.opcmethod = am.oid
join pg_class c ON i.indexrelid = c.oid
join pg_namespace n ON c.relnamespace = n.oid
where am.amname = 'btree'
-- Don't check temp tables, which may be from another session:
and c.relpersistence != 't'
-- Function may throw an error when this is omitted:
and i.indisready AND i.indisvalid;
```

H. Start the remaining Sourcegraph services by following the steps in [applying manifests](#applying-manifests).

## List of ports

To see a list of ports that are currently being used by your Sourcegraph instance:

```bash
$ kubectl get services
```

Example output:

```bash
NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
blobstore                       ClusterIP   10.72.3.144    <none>        9000/TCP                     25h
cadvisor                        ClusterIP   10.72.14.130   <none>        48080/TCP                    23h
codeinsights-db                 ClusterIP   10.72.6.240    <none>        5432/TCP,9187/TCP            25h
codeintel-db                    ClusterIP   10.72.5.10     <none>        5432/TCP,9187/TCP            25h
gitserver                       ClusterIP   None           <none>        10811/TCP                    25h
grafana                         ClusterIP   10.72.6.245    <none>        30070/TCP                    25h
indexed-search                  ClusterIP   None           <none>        6070/TCP                     25h
indexed-search-indexer          ClusterIP   None           <none>        6072/TCP                     25h
kubernetes                      ClusterIP   10.72.0.1      <none>        443/TCP                      25h
node-exporter                   ClusterIP   10.72.5.60     <none>        9100/TCP                     25h
otel-collector                  ClusterIP   10.72.9.221    <none>        4317/TCP,4318/TCP,8888/TCP   25h
pgsql                           ClusterIP   10.72.6.23     <none>        5432/TCP,9187/TCP            25h
precise-code-intel-worker       ClusterIP   10.72.11.102   <none>        3188/TCP,6060/TCP            25h
prometheus                      ClusterIP   10.72.12.201   <none>        30090/TCP                    25h
redis-cache                     ClusterIP   10.72.15.138   <none>        6379/TCP,9121/TCP            25h
redis-store                     ClusterIP   10.72.4.162    <none>        6379/TCP,9121/TCP            25h
searcher                        ClusterIP   None           <none>        3181/TCP,6060/TCP            23h
sourcegraph-frontend            ClusterIP   10.72.12.103   <none>        30080/TCP,6060/TCP           25h
sourcegraph-frontend-internal   ClusterIP   10.72.9.155    <none>        80/TCP                       25h
syntect-server                  ClusterIP   10.72.14.49    <none>        9238/TCP,6060/TCP            25h
worker                          ClusterIP   10.72.7.72     <none>        3189/TCP,6060/TCP            25h
```
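
The service list can also be queried for a single port, for example to avoid hard-coding the frontend port when port-forwarding. The following is a minimal sketch; the function names `service_port` and `forward_frontend` are hypothetical, and it assumes the HTTP port is the first port listed for the `sourcegraph-frontend` service, as in the example output above.

```bash
# Hypothetical helper: print the first port exposed by a service.
service_port() {
    kubectl get service "$1" -o jsonpath='{.spec.ports[0].port}'
}

# Hypothetical helper: forward a local port to the frontend service port.
# ASSUMPTION: the first listed port of sourcegraph-frontend is the HTTP port.
# Usage: forward_frontend [LOCAL_PORT]
forward_frontend() {
    local local_port="${1:-3080}"
    local port

    port="$(service_port sourcegraph-frontend)" || return 1
    [ -n "$port" ] || { echo "could not determine frontend port" >&2; return 1; }

    kubectl port-forward svc/sourcegraph-frontend "$local_port:$port"
}
```

With the example output above, `forward_frontend` is equivalent to `kubectl port-forward svc/sourcegraph-frontend 3080:30080`.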

## Migrate to Kustomize

See the [migration docs for Kustomize](/self-hosted/deploy/kubernetes/kustomize/migrate) for more information.

## Upgrade

-   See the [Updating Sourcegraph docs](/self-hosted/updates/) on how to upgrade.
-   See the [Updating a Kubernetes Sourcegraph instance docs](/self-hosted/deploy/kubernetes/upgrade) for details on changes in each version to determine if manual migration steps are necessary.

## Troubleshoot

See the [Troubleshooting docs](/self-hosted/deploy/kubernetes/troubleshoot).
