Monitoring
After successfully setting up a node, the next important step is to set up monitoring so you can keep track of the node's health and performance.
JSON-RPC Endpoint
Test JSON-RPC Interface
After the full node starts, you can test the JSON-RPC interfaces.
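For example, a quick way to confirm that the endpoint responds, and that the node is on the network you expect, is to query the chain identifier. This sketch assumes the default JSON-RPC port 9000 and that the iota_getChainIdentifier method is available on your node version:
curl --json '{"jsonrpc":"2.0","id":1,"method":"iota_getChainIdentifier","params":[]}' localhost:9000 -s | jq .result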
View Activity on Your Local Full Node with IOTA Explorer
The IOTA Explorer supports connecting to any network as long as it has HTTPS enabled. To view activity on your local full node:
- Open the URL: https://explorer.iota.org/
- Then select Custom RPC URL from the network dropdown in the top right.
Fetch latest checkpoint using JSON-RPC
curl --json '{"jsonrpc":"2.0","id":1,"method":"iota_getLatestCheckpointSequenceNumber","params":[]}' localhost:9000 -s | jq .result
To verify node health, check that this value matches (or closely tracks) the latest checkpoint known by the rest of the network, for example the value shown on https://explorer.iota.org/
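If you prefer to script this comparison rather than checking the explorer manually, you can ask a public full node for the same value and compare it with your own. The public endpoint below is an assumption; substitute an RPC URL for the network your node runs on:
# Compare the local checkpoint height with the one reported by a public full node.
local_cp="$(curl --json '{"jsonrpc":"2.0","id":1,"method":"iota_getLatestCheckpointSequenceNumber","params":[]}' localhost:9000 -s | jq -r .result)"
remote_cp="$(curl --json '{"jsonrpc":"2.0","id":1,"method":"iota_getLatestCheckpointSequenceNumber","params":[]}' https://api.mainnet.iota.cafe -s | jq -r .result)"
echo "local: $local_cp, network: $remote_cp, lag: $((remote_cp - local_cp))"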
Node Health Metrics
IOTA nodes expose a wide range of metrics to be scraped by Prometheus.
By default, metrics are available at the http://localhost:9184/metrics endpoint.
A common way to visualize these metrics is with Grafana.
Additionally, it is common to run node exporter on the host so that Prometheus can also scrape system-level performance metrics (CPU, memory, disk) alongside the node's own metrics.
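For reference, a minimal Prometheus scrape job for this endpoint might look like the following prometheus.yml fragment (the job name and scrape interval are placeholders; adapt them to your setup):
scrape_configs:
  - job_name: iota-node
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9184']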
Fetch key health metrics
Key health metrics via the /metrics HTTP endpoint:
curl -s localhost:9184/metrics | grep -E "^last_executed_checkpoint|^highest_synced_checkpoint|^highest_known_checkpoint|^last_committed_round|^consensus_threshold_clock_round|^highest_received_round|^consensus_proposed_blocks|^uptime"
For instance, for a validator node, the output might look like this:
consensus_proposed_blocks{force="false"} 247
consensus_proposed_blocks{force="true"} 1
consensus_threshold_clock_round 257
highest_known_checkpoint 555
highest_synced_checkpoint 886
last_executed_checkpoint 890
last_executed_checkpoint_age_bucket{le="0.001"} 0
last_executed_checkpoint_age_bucket{le="0.005"} 0
last_executed_checkpoint_age_bucket{le="0.01"} 0
...
last_executed_checkpoint_age_bucket{le="60"} 891
last_executed_checkpoint_age_bucket{le="90"} 891
last_executed_checkpoint_age_bucket{le="+Inf"} 891
last_executed_checkpoint_age_sum 156.52341099999992
last_executed_checkpoint_age_count 891
last_executed_checkpoint_timestamp_ms 1748335503888
uptime{chain_identifier="b5d7e5c8",is_docker="false",os_version="macOS 15.5 Sequoia",process="validator",version="1.1.0"} 196
Ensure node health using last checkpoint timestamp
To make sure your node is running properly, check that the last processed checkpoint is recent enough:
- A lag of around 10 seconds is typical.
- Up to 30 seconds is still fine.
- The timestamp difference should stay under 1 minute.
You can read this from the last_executed_checkpoint_timestamp_ms metric shown above and compare it with the current time using the following commands:
# Timestamp (in milliseconds) of the last executed checkpoint
last_executed_checkpoint_timestamp_ms="$(curl -s localhost:9184/metrics | grep ^last_executed_checkpoint_timestamp_ms | awk '{print $2}')"
# Current time in milliseconds
now_timestamp="$(date +%s%3N)"
# Healthy if the last executed checkpoint is less than 60 seconds old
if (( now_timestamp - last_executed_checkpoint_timestamp_ms < 60000 )); then
  echo "[OK] healthy & in sync"
else
  echo "[ERROR] Node unhealthy. Last known checkpoint is too old."
fi
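Note that date +%s%3N relies on GNU date. Where %N is not supported (for example on macOS/BSD), you can fall back to whole-second precision, which is enough for a 60-second threshold:
now_timestamp="$(( $(date +%s) * 1000 ))"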
Monitor consensus sync status
To ensure your node's consensus module is properly synced with the network, monitor the difference between consensus_commit_sync_local_index and consensus_commit_sync_quorum_index:
metrics="$(curl -s localhost:9184/metrics)"
local_index="$(echo "$metrics" | grep ^consensus_commit_sync_local_index | awk '{print $2}')"
quorum_index="$(echo "$metrics" | grep ^consensus_commit_sync_quorum_index | awk '{print $2}')"
difference=$((quorum_index - local_index))
if (( difference > 100 )); then
echo "[WARNING] Consensus module not in sync. Difference: $difference"
echo "[INFO] Monitor this difference over time:"
echo " - If growing: Node falling behind network"
echo " - If shrinking: Node catching up to network"
else
echo "[OK] Consensus module in sync. Difference: $difference"
fi
Key indicators:
- Difference > 100: Node's consensus module is not in sync
- Growing difference: Network is advancing faster than node is syncing
- Shrinking difference: Node is correctly syncing and catching up
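One simple way to observe the trend, assuming the watch utility is installed, is to re-read both counters every 30 seconds and see whether the gap grows or shrinks:
watch -n 30 'curl -s localhost:9184/metrics | grep -E "^consensus_commit_sync_(local|quorum)_index"'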
Monitor skipped proposals
Check the consensus_core_skipped_proposals metric to detect issues with block production:
metrics="$(curl -s curl -s localhost:9184/metrics)"
skipped_proposals="$(echo "$metrics" | grep ^consensus_core_skipped_proposals)"
if [[ -z "$skipped_proposals" ]]; then
echo "[OK] No skipped proposals metric found"
else
echo "Skipped proposals by reason:"
echo "$skipped_proposals"
echo ""
echo "Monitor these values over time. Growing counts indicate issues:"
echo " - no_quorum_subscriber: Insufficient connections (< 2f+1 stake) to produce blocks"
echo " - Other reasons: Check the specific label for the cause"
fi
Key indicators:
- Growing no_quorum_subscriber: Insufficient validator connections (less than 2f+1 stake connected); the node cannot safely produce new blocks
- Growing other labels: The corresponding label value indicates the specific cause of skipped proposals
- Run the script periodically to compare values and identify trends (see the sketch below)
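A minimal way to keep a history for those comparisons is to append timestamped snapshots of the counters to a file and diff successive lines later (the log path here is only an example):
# Append a timestamped snapshot of the skipped-proposals counters; run this
# periodically (e.g. from cron) and compare successive lines over time.
echo "$(date -u +%FT%TZ) $(curl -s localhost:9184/metrics | grep ^consensus_core_skipped_proposals | tr '\n' ' ')" >> ~/iota-skipped-proposals.log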
Overall node health
When all of the above health checks pass, your node can be considered healthy (a combined check is sketched after this list):
- ✅ Last checkpoint timestamp is recent (< 60 seconds)
- ✅ Consensus sync difference is acceptable (< 100)
- ✅ Validator connections are sufficient (≥ 80% of committee)
- ✅ Connection symmetry is maintained
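The first two items can be combined into a single metric-based check. The following sketch uses only the metrics and thresholds documented above (and assumes they are all present, as on a validator-style node); it exits non-zero when a check fails, which makes it easy to hook into cron, a systemd timer, or an external alerting system. The remaining two items depend on your validator connectivity monitoring and are not covered here.
# Sketch: combined health check built from the metrics discussed above.
metrics="$(curl -s localhost:9184/metrics)"
last_ms="$(echo "$metrics" | grep ^last_executed_checkpoint_timestamp_ms | awk '{print $2}')"
now_ms="$(( $(date +%s) * 1000 ))"
local_index="$(echo "$metrics" | grep ^consensus_commit_sync_local_index | awk '{print $2}')"
quorum_index="$(echo "$metrics" | grep ^consensus_commit_sync_quorum_index | awk '{print $2}')"
status=0
if (( now_ms - last_ms >= 60000 )); then
  echo "[ERROR] Last executed checkpoint is older than 60 seconds."
  status=1
fi
if (( quorum_index - local_index > 100 )); then
  echo "[WARNING] Consensus sync lag is above 100 commits."
  status=1
fi
if (( status == 0 )); then
  echo "[OK] Node healthy."
fi
exit $status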
Logs
Configuring Logs
Log level (error, warn, info, debug, trace) is controlled using the RUST_LOG environment variable.
The RUST_LOG_JSON=1 environment variable can optionally be set to enable logging in JSON structured format.
Depending on your deployment method, these are configured in the following places:
- Systemd: set the variables in the service unit file:
[Service]
...
Environment=RUST_BACKTRACE=1
Environment=RUST_LOG=info,iota_core=debug,consensus=debug,jsonrpsee=error
- Docker Compose: add the following to the node container settings:
environment:
  - RUST_BACKTRACE=1
  - RUST_LOG=info,iota_core=debug,consensus=debug,jsonrpsee=error
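After changing these values, restart the node so the new configuration takes effect. The service and container naming below follows the examples used elsewhere in this guide; adjust them to your setup:
# Systemd: reload unit files and restart the service
sudo systemctl daemon-reload
sudo systemctl restart iota-node
# Docker Compose: recreate the container so the new environment is applied
sudo docker compose up -d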
It is possible to change the logging configuration while a node is running using the admin interface.
Verify Configured Logging Values
- Systemd:
To view the currently configured logging values:
curl -w "\n" localhost:1337/logging
To change the currently configured logging values:
curl localhost:1337/logging -d "info"
- Docker Compose:
Note that the admin port (1337) is only exposed to localhost by default, so the commands must be executed inside the container.
To view the currently configured logging values:
docker exec <FULLNODE_CONTAINER_NAME> curl -w "\n" localhost:1337/logging
To change the currently configured logging values:
docker exec <FULLNODE_CONTAINER_NAME> curl localhost:1337/logging -d "info"
Replace <FULLNODE_CONTAINER_NAME> with your actual container name, such as iota-fullnode-docker-setup-fullnode-1.
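The payload passed with -d appears to accept the same filter syntax as RUST_LOG (treat this as an assumption and verify it against your node version), so you can temporarily raise the verbosity of a single component; prefix the command with docker exec <FULLNODE_CONTAINER_NAME> for Docker Compose deployments:
# Assumed RUST_LOG-style filter: info globally, debug for the consensus target
curl localhost:1337/logging -d "info,consensus=debug"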
Viewing Logs
- Systemd:
To view and follow the IOTA node logs:
journalctl -u iota-node -f
To search for a particular match:
journalctl -u iota-node -g <SEARCH_TERM>
- Docker Compose:
To view and follow the logs:
sudo docker compose logs -f [node_container_name]
By default, all logs are output. Limit this using --since:
sudo docker logs --since 10m -f [node_container_name]
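Similarly, for Systemd deployments you can limit journal output to a recent time window:
journalctl -u iota-node --since "10 minutes ago"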
Monitoring Services
Monitoring services are essential for the reliability, security, and performance of the network: they provide real-time insights, detect anomalies, enable proactive issue resolution, and send automatic alerts.
Prometheus and Grafana (recommended)
Example pre-made dashboards you can use:
- Community-owned Grafana dashboard: https://github.com/stakeme-team/grafana-iota
- Grafana setup for the local private network, which might be a good example of how to build your own setup.
- Officially supported dashboards can be found in the IOTA repository.
Dolphin
Dolphin is a CLI tool that provides high-level features for validator and full node monitoring. Under the hood, it uses the IOTA node's Prometheus metrics exporter to check the health of the node. More info: https://gitlab.com/blockscope-net/dolphin-v2