The way that I think of a tracer token is a synthetic transaction that keeps track of when it makes it to each of the publisher, distributor, and subscriber. For instance, I can post a token today at 10:00:00 and it will tell me that it made it to the distributor at 10:00:03, and finally to the subscriber at 10:00:05. Moreover, the distribution database keeps track of all tokens posted historically (up to the history retention for the distributor).
How can we use this for monitoring? Well, the first step in doing this is to automate the posting of the tracer tokens. Most people use the "Post tracer token" button to introduce ad hoc tokens into the system. That isn't great for monitoring a system that's already behind as the actual problem may have started a while ago! Fortunately, the solution is simple: schedule a SQL Agent job to execute sp_posttracertoken. I leave it as an exercise to the reader to write a job that loops through all of the publications in the database and post a token for each one. Keep in mind that how often you schedule this to run is the granularity of your monitoring. So choose a frequency that's right for your environment.
We also need an easy way to read the data. For this, I wrote a couple of views. The first one just gets the raw data out of the tokens and presents it in an easy-to-read format.
USE [distribution] GO IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[tokens]')) DROP VIEW [dbo].[tokens] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO create view [dbo].[tokens] as select ps.name as [publisher], p.publisher_db, p.publication, ss.name as [subscriber], da.subscriber_db, t.publisher_commit, t.distributor_commit, h.subscriber_commit, datediff(second, t.publisher_commit, t.distributor_commit) as [pub to dist (s)], datediff(second, t.distributor_commit ,h.subscriber_commit) as [dist to sub (s)], datediff(second, t.publisher_commit, h.subscriber_commit) as [total latency (s)] from mstracer_tokens t inner join MStracer_history h on t.tracer_id = h.parent_tracer_id inner join mspublications p on p.publication_id = t.publication_id inner join sys.servers ps on p.publisher_id = ps.server_id inner join msdistribution_agents da on h.agent_id = da.id inner join sys.servers ss on da.subscriber_id = ss.server_id
The second uses the first to find (for each publication that this distributor services) the latest token that has made it all the way to the subscriber.
I like to set up alerting off of this second view based on the publisher_commit column (e.g. "select * from replication_lag where publisher_commit < dateadd(minute, -30, getdate())"). If the newest token to make it to the subscriber is "old" (for whatever definition of "old" you'd like to use), raise an alert.
USE [distribution] GO IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[repllication_lag]')) DROP VIEW [dbo].[replication_lag] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO create view [dbo].[replication_lag] as with cte as ( select *, row_number() over (partition by [publisher], [publisher_db], [publication], [subscriber] order by [publisher_commit] desc) as [r] from [dbo].[tokens] where subscriber_commit is not null ) select * from cte where [r] = 1