Commit 08d811bf authored by Amy Qualls

Add section for alert and incident mgmt

Start by building this section.
parent a34a8765
......@@ -9,14 +9,89 @@ info: To determine the technical writer assigned to the Stage/Group associated w
GitLab provides a variety of tools to help operate and maintain
your applications:
## Measure reliability and stability with metrics
Metrics help you understand the health and performance of your infrastructure,
applications, and systems by providing insights into your application's reliability,
stability, and performance. GitLab provides a dashboard out-of-the-box, which you
can extend with custom metrics, and augment with additional custom dashboards. You
can track the metrics that matter most to your team, generate automated alerts when
performance degrades, and manage those alerts - all within GitLab.
- Collect [Prometheus metrics](../user/project/integrations/prometheus_library/index.md).
- Monitor application status with the [out-of-the-box metrics dashboard](metrics/index.md),
which you can [customize](metrics/dashboards/settings.md).
- Create [custom performance alerts](metrics/alerts.md).
- Create [custom metrics](metrics/index.md#adding-custom-metrics) and
[custom dashboards](metrics/dashboards/index.md).
## Manage alerts and incidents
GitLab helps reduce alert fatigue for IT responders by providing tools to identify
issues across multiple systems and aggregate alerts in a centralized place. Your
team needs a single, central interface where they can easily investigate alerts
using metrics and logs, and promote the critical alerts to incidents.
Are your alerts too noisy? Alerts based on GitLab metrics can be configured
and fine-tuned in GitLab immediately following a fire-fight.
- [Manage your external alerts](../user/project/operations/alert_management.md) and [manage incidents](../user/incident_management/index.md) in GitLab.
- [Configure alerts for metrics](metrics/alerts.md#set-up-alerts-for-prometheus-metrics-core) in GitLab.
- Create a [status page](incident_management/status_page.md)
to communicate efficiently to your users during an incident.
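
To illustrate the idea of aggregating alerts from several systems and promoting
only the critical ones to incidents, here is a minimal sketch in Python. It is
not GitLab's API or implementation: the `Alert` type, the severity levels, and
the `aggregate` and `promote_to_incidents` helpers are hypothetical names chosen
for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str    # hypothetical: the monitoring system that sent the alert
    title: str
    severity: str  # hypothetical levels: "low", "medium", "critical"


def aggregate(alerts):
    """Group alerts that share a title and severity, keeping their sources."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert.title, alert.severity)].append(alert.source)
    return dict(groups)


def promote_to_incidents(groups):
    """Return the titles of aggregated alerts whose severity is critical."""
    return sorted(title for (title, severity) in groups if severity == "critical")


alerts = [
    Alert("prometheus", "High error rate", "critical"),
    Alert("external-monitor", "High error rate", "critical"),
    Alert("prometheus", "Disk usage above 80%", "medium"),
]
groups = aggregate(alerts)
incidents = promote_to_incidents(groups)
print(incidents)
```

In this sketch, the two "High error rate" alerts from different sources collapse
into one group, so a responder would see a single candidate incident rather than
two separate notifications.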
## Track errors in your application
GitLab integrates with [Sentry](https://sentry.io/welcome/) to aggregate errors
from your application and surface them in the GitLab UI with the sorting and filtering
features you need to help identify which errors are the most critical. Through the
entire triage process, your users can create GitLab issues to track critical errors
and the work required to fix them - all without leaving GitLab.
- Discover and view errors generated by your applications with
[Error Tracking](error_tracking.md).
## Trace application health and performance **(ULTIMATE)**
Application tracing in GitLab is a way to measure an application's performance and
health while it's running. After configuring your application to enable tracing, you
gain in-depth insight into your application's layers. With application tracing,
you can measure the execution time of a user journey for troubleshooting or
optimization purposes.
GitLab integrates with [Jaeger](https://www.jaegertracing.io/) - an open-source,
end-to-end distributed tracing system used for monitoring and troubleshooting
microservices-based distributed systems - and displays results within GitLab.
- [Trace the performance and health](tracing.md) of a deployed application. **(ULTIMATE)**
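
As a rough illustration of what measuring the execution time of a user journey
means, the sketch below times nested steps using only the Python standard
library. It does not use Jaeger or any GitLab API: the `span` helper, the
`spans` list, and the `checkout_journey` function are hypothetical names, and a
real tracing client would also record trace IDs and parent-child relationships.

```python
import time
from contextlib import contextmanager

spans = []  # collected (name, duration_in_seconds) pairs, in completion order


@contextmanager
def span(name):
    """Record how long the enclosed block takes to run."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append((name, time.perf_counter() - start))


def checkout_journey():
    """Hypothetical user journey made of two timed steps."""
    with span("checkout"):
        with span("load_cart"):
            time.sleep(0.01)  # stand-in for real work
        with span("charge_card"):
            time.sleep(0.02)  # stand-in for real work


checkout_journey()
for name, duration in spans:
    print(f"{name}: {duration * 1000:.1f} ms")
```

Because each span is recorded when its block finishes, the inner steps appear
before the enclosing `checkout` span, whose duration covers both of them.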
## Aggregate and store logs
Developers need to troubleshoot application changes in development, and incident
responders need aggregated, real-time logs when troubleshooting problems with
production services. GitLab provides centralized, aggregated log storage for your
distributed application, enabling you to collect logs across multiple services and
infrastructure.
- [View logs of pods or managed applications](../user/project/clusters/kubernetes_pod_logs.md)
in connected Kubernetes clusters.
## Manage your infrastructure in code
GitLab integrates with [Terraform](https://www.terraform.io/), uniting your GitOps and
Infrastructure-as-Code (IaC) workflows with GitLab's authentication, authorization,
and user interface. GitLab lowers the barrier to entry for adopting Terraform, so you
can manage and provision infrastructure through machine-readable definition files,
rather than physical hardware configuration or interactive configuration tools.
Definitions are stored in version control, extending proven coding techniques to
your infrastructure, and blurring the line between what is an application and what is
an environment.
- Learn how to [manage your infrastructure with GitLab and Terraform](../user/infrastructure/index.md).
## More features
- Deploy to different [environments](../ci/environments/index.md).
- Manage your [Alerts](../user/project/operations/alert_management.md) and [Incidents](../user/incident_management/index.md).
- Connect your project to a [Kubernetes cluster](../user/project/clusters/index.md).
- Manage your infrastructure with [Infrastructure as Code](../user/infrastructure/index.md) approaches.
- Discover and view errors generated by your applications with [Error Tracking](error_tracking.md).
- Handle incidents in your applications and services with [Incident Management](incident_management/index.md).
- See how your application is used and analyze events with [Product Analytics](product_analytics.md).
- Create, toggle, and remove [Feature Flags](feature_flags.md). **(PREMIUM)**
- [Trace](tracing.md) the performance and health of a deployed application. **(ULTIMATE)**
- Change the [settings of the Monitoring Dashboard](metrics/dashboards/settings.md).