The first fix is related to the `OpenTelemetry.Ctx` API. I've noticed that in a few
scenarios the current span of a trace may get lost after Ecto calls.
Looking at the typespecs, `attach/1` is a Ctx -> Token function, while
`detach/1` is a Token -> Ctx function. That made me assume the expected
input of `detach/1` is the return value of `attach/1`. Indeed, after this
change Ecto calls leave the parent span untouched.
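A minimal sketch of the corrected pairing with the public `OpenTelemetry.Ctx` API:

```elixir
# Sketch: keep the Token returned by attach/1 and pass that same token
# to detach/1, rather than passing a Ctx back in.
ctx = OpenTelemetry.Ctx.get_current()
token = OpenTelemetry.Ctx.attach(ctx)

try do
  # ... run the instrumented work under the attached context ...
  :ok
after
  # detach/1 expects the Token produced by attach/1
  OpenTelemetry.Ctx.detach(token)
end
```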
That led to a second bug. When Ecto makes simple calls within a Task,
the special propagation code for preloads causes it to skip the current
span, if any. The solution here is to first check the current process.
One test was added to reproduce this bug.
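A hedged sketch of the intended lookup order; the helper passed in stands for whatever mechanism propagates the caller's span for preload Tasks, and none of the names below are the library's actual internals:

```elixir
defmodule SpanLookupSketch do
  require OpenTelemetry.Tracer

  # Illustrative only: prefer the span already present in the current
  # process, and only fall back to the Task caller's span when there is none.
  def parent_span_ctx(fetch_caller_span) when is_function(fetch_caller_span, 0) do
    case OpenTelemetry.Tracer.current_span_ctx() do
      # No span in the current process: use the propagated caller span.
      :undefined -> fetch_caller_span.()
      # The current process already has a span: keep it untouched.
      span_ctx -> span_ctx
    end
  end
end
```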
* Fix CI errors, update GHA deps, update versions
* output syntax
* remove OTP 22 tests
* set concurrency to cancel in progress
* whitespace
* incompatible vsns and failed test
* Try pulling excludes out
* Escaping
* Just drop < 1.13 until this can move to workflows
* quote
* fix: remove extra bracket in mix.exs
* use capture log for less verbose test output
* add span_name opt for overriding span name
* add moduledoc
* allow function for span_name opt
Instead of custom attributes, leverage the status description as
described in the Semantic Conventions. This approach is taken from the
current `opentelemetry_ecto` implementation.
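A minimal sketch of what this looks like with the Elixir API (the error value is just an example):

```elixir
require OpenTelemetry.Tracer

# Example error; in the real instrumentation this comes from the failed
# call being traced.
reason = %RuntimeError{message: "connection refused"}

# Record the message in the status description rather than in a custom
# attribute, per the Semantic Conventions.
OpenTelemetry.Tracer.set_status(
  OpenTelemetry.status(:error, Exception.message(reason))
)
```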
A small unrelated change fixes the license description in `mix.exs`.
On start, the default exporter immediately attempts to connect to the
default `:otel-collector` port. As we don't have any collector running
in our test environment, this results in a few warnings. That's not an
issue in itself, as the code immediately switches to another exporter,
but it creates a lot of noise.
This patch moves the exporter setup to `config/test.exs`, essentially
removing the need to restart the opentelemetry application for each test
case. The only work setup blocks do is update the exporter's target pid.
The processor was changed to simple mode, which is now available and also
removes another vector of (unlikely but theoretically possible) race conditions.
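A sketch of the resulting test configuration, assuming the SDK's `:otel_simple_processor` and pid exporter (exact keys can vary between SDK versions):

```elixir
# config/test.exs: configured once, so tests never need to restart the
# :opentelemetry application.
import Config

config :opentelemetry,
  traces_exporter: :none,
  processors: [{:otel_simple_processor, %{}}]
```

Each test's setup block then only needs something along the lines of `:otel_simple_processor.set_exporter(:otel_exporter_pid, self())` to point the exporter at the test process.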
* Process propagator library
* Fix Elixir API
* CI files
* Update propagators/opentelemetry_process_propagator/lib/opentelemetry_process_propagator.ex
Co-authored-by: Andrew Rosa <dev@andrewhr.io>
* Update propagators/opentelemetry_process_propagator/lib/opentelemetry_process_propagator.ex
Co-authored-by: Andrew Rosa <dev@andrewhr.io>
* format
Co-authored-by: Andrew Rosa <dev@andrewhr.io>
* Handle binary resp_status from Cowboy and rename http.status to http.status_code
* Fix test to use http.status_code as well
* Handle resp_status being undefined, which is not an error
- This could be due to a websocket upgrade request.
* Rename Status1 and transform_status to more concise names
* Add test case for handling binary response code
* Fix syntax and failing tests
* Always convert cowboy status to status code
* Set otel span status as error when status code >= 500
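A hedged sketch of the normalization and status mapping these commits describe; the function names are illustrative, not the module's real ones:

```elixir
defmodule CowboyStatusSketch do
  # Cowboy can report the response status as an integer (200) or a
  # binary such as "200 OK"; always normalize to an integer code.
  def to_status_code(status) when is_integer(status), do: status

  def to_status_code(status) when is_binary(status) do
    status
    |> String.split(" ", parts: 2)
    |> hd()
    |> String.to_integer()
  end

  # resp_status may also be :undefined (e.g. a websocket upgrade); that
  # is not an error, so there is no code to derive.
  def to_status_code(:undefined), do: :undefined

  # Server errors (>= 500) mark the span as an error; other codes leave
  # the span status alone in this sketch.
  def maybe_set_error(span_ctx, code) when is_integer(code) and code >= 500 do
    OpenTelemetry.Span.set_status(span_ctx, OpenTelemetry.status(:error, ""))
  end

  def maybe_set_error(_span_ctx, _code), do: :ok
end
```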
There is an edge case: if you use `forward/4` together with Plug.ErrorHandler,
then when an exception reaches the outer router, Plug.send_resp will be
called, triggering `[:phoenix, :endpoint, :stop]`, and the span will be
gone by the time the outer router gets the exception. This causes the
telemetry handler to crash and be detached.
Sequence of events:
- [:phoenix, :endpoint, :start]
- [:phoenix, :router_dispatch, :exception] (inner router)
- [:phoenix, :endpoint, :stop]
- [:phoenix, :router_dispatch, :exception] (outer router) ** here there is no span, crashes
* Bootstrap Phoenix application from mix phx.new
* Add opentelemetry dependencies
* Setup opentelemetry for local env
* Setup Dockerfile, docker-compose and otel config
* Configure runtime config for exporter to export to otel collector in prod env
* Generate Posts HTML resources
* Add Release module to run migration in release
* Generate Users LiveView resource
* Add exporter configuration to export directly to external service
* Update README.md to include description and instructions
* Update README.md to include more details on exporting traces
* Fix otlp collector deprecated ports as suggested
The initial approach follows the Ecto instrumentation, recording spans for
all Redix `[:redix, :pipeline, :stop]` events.
The command sanitization is inspired by and adapted from the [Java
instrumentation][1], from which I've also copied the list of commands and
the configuration each of them should follow.
Network attributes are tracked via a "sidecar" process, which keeps
track of connection attributes, also via `telemetry`. This extra bit of
bookkeeping is needed because command events unfortunately don't include
that piece of information.
[1]: b2bc41453b/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/db/RedisCommandSanitizer.java
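A hedged sketch of the overall shape: one handler for both connection and pipeline events, with `:persistent_term` standing in for the sidecar process and the metadata keys being assumptions about Redix's telemetry payloads:

```elixir
defmodule RedixTracingSketch do
  require OpenTelemetry.Tracer

  def setup do
    :telemetry.attach_many(
      "redix-tracing-sketch",
      [[:redix, :connection], [:redix, :pipeline, :stop]],
      &__MODULE__.handle_event/4,
      %{}
    )
  end

  # Bookkeeping: remember network attributes per connection pid, because
  # pipeline events do not carry them.
  def handle_event([:redix, :connection], _measurements, meta, _config) do
    :persistent_term.put({__MODULE__, meta.connection}, meta.address)
  end

  # One span per executed pipeline, enriched with the stored address.
  # (Span timing is simplified here; real instrumentation would use the
  # event's duration measurement.)
  def handle_event([:redix, :pipeline, :stop], _measurements, meta, _config) do
    address = :persistent_term.get({__MODULE__, meta.connection}, :undefined)

    OpenTelemetry.Tracer.with_span "redix.pipeline" do
      OpenTelemetry.Tracer.set_attributes(%{
        "net.peer.name": address,
        "db.statement": inspect(meta.commands)
      })
    end
  end
end
```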
By default a new trace is automatically started when a job is processed,
by monitoring these events (a setup sketch follows the list):
* `[:oban, :job, :start]` — at the point a job is fetched from the database and will execute
* `[:oban, :job, :stop]` — after a job succeeds and the success is recorded in the database
* `[:oban, :job, :exception]` — after a job fails and the failure is recorded in the database
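A hedged usage sketch of attaching these handlers, assuming the library exposes a `setup/0` and using placeholder application names:

```elixir
defmodule MyApp.Application do
  # MyApp.Repo and the :my_app Oban config are placeholders for the
  # host application's own setup.
  use Application

  @impl true
  def start(_type, _args) do
    # Attach the OpenTelemetry handlers before Oban starts processing jobs.
    OpentelemetryOban.setup()

    children = [
      MyApp.Repo,
      {Oban, Application.fetch_env!(:my_app, Oban)}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```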
To also record a span when a job is created and to link traces together,
`Oban.insert/2` has to be replaced by `OpentelemetryOban.insert/2`.
Before:
```elixir
%{id: 1, in_the: "business", of_doing: "business"}
|> MyApp.Business.new()
|> Oban.insert()
```
After:
```elixir
%{id: 1, in_the: "business", of_doing: "business"}
|> MyApp.Business.new()
|> OpentelemetryOban.insert()
```
Co-authored-by: Tristan Sloughter <t@crashfast.com>