JooHan Hong / joohanhong
Commit 793e61b2, authored Mar 30, 2021 by JooHan Hong
node down: first pass complete
Parent: 9db2a5b4
Pipeline #5315 passed with stages in 47 seconds
Showing 4 changed files with 30 additions and 8 deletions
* DOCKER/PROMETHEUS/ITEM/NODES/README.md (+1 -1)
* DOCKER/PROMETHEUS/ITEM/NODES/RESULT/DOWN/README.md (+29 -7)
* DOCKER/PROMETHEUS/ITEM/NODES/images/node_down_alert.png (+0 -0)
* DOCKER/PROMETHEUS/ITEM/NODES/images/node_down_resolved.png (+0 -0)
DOCKER/PROMETHEUS/ITEM/NODES/README.md
@@ -7,7 +7,7 @@
 | NO | ITEM | Link | Notes |
 | ------ | ------ | ------ | ------ |
-| 1 | Node(Host) **Down** | [GO](./RESULT/DOWN/) | |
+| 1 | Node-Exporter **Down** | [GO](./RESULT/DOWN/) | |
 | 2 | Node CPU **Steal** | [GO](./RESULT/STEAL/) | |
 | 3 | Node **HIGH CPU** Load | [GO](./RESULT/CPULOAD/) | |
 | 4 | Node CPU Context Switching **Heavy** | [GO](./RESULT/CONTEXTSWITCHING/) | |
DOCKER/PROMETHEUS/ITEM/NODES/RESULT/DOWN/README.md
 [![logo](https://www.hongsnet.net/images/logo.gif)](https://www.hongsnet.net)

-# Node(Host) Resource -> Node Down Verification
+# Node-Exporter -> Node Down Verification

-> Configure and verify that an alert is received when a node fails.
+> Configure and verify that an alert is received when Node-Exporter fails.

 # Configuration

-Rule that raises an alert when a node is detected as `Down` for 3 minutes
+Rule that raises an alert when Node-Exporter is detected as `Down` for 1 minute

 - **Result**
````diff
@@ -38,14 +38,14 @@ data:
 groups:
 - name: Node(Host) Down alerts
   rules:
-  - alert: Node down
-    expr: up{job="node_exporter"} == 0
-    for: 3m
+  - alert: Node-Exporter Down
+    expr: up{job="ext-node_exporter"} == 0
+    for: 1m
     labels:
       severity: fatal
     annotations:
       title: "Node {{ $labels.instance }} is down"
-      description: "Failed to scrape {{ $labels.job }} on {{ $labels.instance }} for more than 3 minutes. Node seems down."
+      description: "Failed to scrape {{ $labels.job }} on {{ $labels.instance }} for more than 1 minutes. Node seems down."
 ...(remainder omitted)
 ```
````
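The `for: 1m` clause in the rule means the alert does not fire on the first failed scrape: it stays pending until `up == 0` has held for a full minute, and any successful scrape in between resets the timer. A minimal sketch of that state logic in shell, assuming a 15-second evaluation interval (an assumption; the interval is not shown in this commit) and a hypothetical helper named `alert_state`:

```bash
#!/usr/bin/env bash
# Simplified model of the `for` clause (hypothetical helper, not part of the commit).
# Usage: alert_state FOR_SECONDS INTERVAL_SECONDS UP_SAMPLE...
# Each UP_SAMPLE is the value of `up` (1 or 0) at one rule evaluation.
alert_state() {
  local for_secs=$1 interval=$2
  shift 2
  local t=0 active_since="" state="inactive" up
  for up in "$@"; do
    if [ "$up" -eq 0 ]; then
      # Condition holds: remember when it first became true.
      if [ -z "$active_since" ]; then
        active_since=$t
      fi
      if [ $((t - active_since)) -ge "$for_secs" ]; then
        state="firing"
      else
        state="pending"
      fi
    else
      # Condition cleared: the pending timer resets.
      active_since=""
      state="inactive"
    fi
    t=$((t + interval))
  done
  echo "$state"
}

alert_state 60 15 0 0 0 0 0   # down for 60s straight: prints "firing"
alert_state 60 15 0 0 1 0 0   # one good scrape in between: prints "pending"
```

With the previous `for: 3m` setting, the same model would need `up == 0` to hold for 180 seconds before reporting `firing`.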
@@ -60,4 +60,26 @@ data:
# Verify

* [**STEP 1**] : Forcibly stop Node-Exporter.

```bash
# systemctl stop node_exporter.service
```

* [**STEP 2**] : Confirm that the alert is sent through Alert Manager.

![node_down_alert](../../images/node_down_verify.png)

* [**STEP 3**] : To verify the Resolved notification, restart the stopped Node-Exporter.

```bash
# systemctl start node_exporter.service
```

* [**STEP 4**] : Confirm that the `Resolved` alert is sent through Alert Manager.

![node_down_alert](../../images/node_down_resolved.png)
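While running the verification steps, the scrape state and the alert itself can also be checked from the command line. A rough sketch, assuming Prometheus listens on `localhost:9090` and Alertmanager on `localhost:9093` (both addresses are assumptions, as are the helper names `query_up` and `list_alerts`); it uses the Prometheus instant-query endpoint `/api/v1/query` and the Alertmanager `/api/v2/alerts` endpoint:

```bash
#!/usr/bin/env bash
# Assumed addresses; override via environment to match the actual deployment.
PROM_URL="${PROM_URL:-http://localhost:9090}"
AM_URL="${AM_URL:-http://localhost:9093}"

# Ask Prometheus for the current `up` value of a scrape job
# (1 = last scrape succeeded, 0 = last scrape failed).
query_up() {
  local job=$1
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=up{job=\"${job}\"}"
}

# List the alerts Alertmanager currently holds, as JSON.
list_alerts() {
  curl -s "${AM_URL}/api/v2/alerts"
}

# Example (run after STEP 1 or STEP 3):
#   query_up ext-node_exporter
#   list_alerts
```

After STEP 1, `query_up ext-node_exporter` should report `0` once the next scrape fails, and the alert should show up in `list_alerts` after the rule's `for` period has elapsed. After STEP 3 the value should return to `1`.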
DOCKER/PROMETHEUS/ITEM/NODES/images/node_down_alert.png (new file, mode 0 → 100644, 28.2 KB)
DOCKER/PROMETHEUS/ITEM/NODES/images/node_down_resolved.png (new file, mode 0 → 100644, 22.7 KB)