Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Traces get rejected if labels have dashes in them #3359

Open
kbolashev opened this issue Jun 17, 2024 · 3 comments
Open

Traces get rejected if labels have dashes in them #3359

kbolashev opened this issue Jun 17, 2024 · 3 comments

Comments

@kbolashev
Copy link

kbolashev commented Jun 17, 2024

Describe the bug

Hi Pyroscope team, thank you very much for this great tool!

We're trying to integrate it into our stack to profile some Go code, but hitting problems with ingesting goroutine and cpu traces.

Some of dependency code we're using that calls subprocesses for execution, sets pprof labels on goroutines to track these subprocesses (PID, etc.). These labels have dashes in them (e.g. process-description), which seems to conflict with Pyroscope's ingestion and we're failing to receive CPU and Goroutine traces (memory traces work fine).

I found the validation code, assuming I'm hitting this line. I'm wondering, if there's a way to skirt by this validation, or alternatively maybe sanitize these at pyroscope-go level.

To Reproduce

Steps to reproduce the behavior:

  1. Start Pyroscope (I have version 1.5.1 running in my local kind cluster right now) and have it be accessible on http://localhost:4040
  2. Clone the reproduction repo, build and run the package.

Expected behavior

All traces show up in pyroscope

Actual: the cpu and goroutine traces don't show up,

Environment

  • Infrastructure: Kubernetes (kind) with port-forwarding for local development.
  • Deployment tool: helm

Additional Context

Error in the stdout:

[ERROR] upload profile: failed to upload. server responded with statusCode: '422' and body: '{"code":"unknown","message":"pushing IngestInput-pprof failed invalid_argument: invalid labels '{__delta__=\"false\", __name__=\"goroutines\", hello-tag=\"world\", pyroscope_spy=\"gospy\", service_name=\"repro\"}' with error: invalid label name 'hello-tag'"}
'
[DEBUG] uploading at http://localhost:4040/ingest?aggregationType=sum&from=1718631497338591000&name=repro%7B__session_id__%3D528d9c54dd243618%7D&sampleRate=100&spyName=gospy&units=samples&until=1718631502391866000
[DEBUG] content type: multipart/form-data; boundary=2877c635eaa015fe8e86cf857751d53f590c439a0368025bf68c04d8c286
[ERROR] upload profile: failed to upload. server responded with statusCode: '422' and body: '{"code":"unknown","message":"pushing IngestInput-pprof failed invalid_argument: invalid labels '{__delta__=\"false\", __name__=\"process_cpu\", hello-tag=\"world\", pyroscope_spy=\"gospy\", service_name=\"repro\"}' with error: invalid label name 'hello-tag'"}
'
@kolesnikovae
Copy link
Collaborator

kolesnikovae commented Jun 18, 2024

Hi @kbolashev! Thank you for filing the issue.

We're currently using prometheus data model for labels (including those that are set as pprof tags) for consistency with other products:

Labels may contain ASCII letters, numbers, as well as underscores. They must match the regex [a-zA-Z_][a-zA-Z0-9_]*.

Recently, we've relaxed the requirement, by allowing . character (replaced to _) in #3335. This is an exception made specifically for OTel compatibility.

Before UTF-8 labels are fully supported in Prometheus, I'm a little hesitant to add more exceptions. Instead, I'd consider altering how we handle such labels: instead of discarding the profile, we could drop the invalid labels. What do you think?

I should note that the use of high cardinality labels will very likely cause performance issues. For example, if you set the PID of the spawn processes as a pprof label, you will get a new profile entry in the storage for each such process. If you wouldn't add this label to Prometheus metrics, you wouldn't want to add it as a pprof label either.

Some of dependency code we're using that calls subprocesses for execution, sets pprof labels on goroutines to track these subprocesses (PID, etc.). These labels have dashes in them (e.g. process-description), which seems to conflict with Pyroscope's ingestion and we're failing to receive CPU and Goroutine traces (memory traces work fine).

Out of curiosity, is this a publicly available package?

@kbolashev
Copy link
Author

Before UTF-8 labels are fully supported in Prometheus, I'm a little hesitant to add more exceptions. Instead, I'd consider altering how we handle such labels: instead of discarding the profile, we could drop the invalid labels. What do you think?

Honestly this would totally work for us, because being able to see the goroutines at all already gets much further and we don't really care about drilling down deeper for now.
Only problem I see is knowing that the labels actually got discarded.

Out of curiosity, is this a publicly available package?

Yes, sure! We're reusing some of Gitea's code for parsing git repos, and they seem to have their own in-house profiling. https://github.com/go-gitea/gitea/blob/main/modules/process/manager.go#L29

I should note that the use of high cardinality labels will very likely cause performance issues. For example, if you set the PID of the spawn processes as a pprof label, you will get a new profile entry in the storage for each such process. If you wouldn't add this label to Prometheus metrics, you wouldn't want to add it as a pprof label either.

Thanks for the warning on this! We'll keep track at how it performs and whether the performance degrades too much for us.
If the cardinality does turn out to be the problem, would you suggest trying to drop those labels altogether?

@kolesnikovae
Copy link
Collaborator

Thank you for the input, @kbolashev! We'll look into this soon (I'm sorry, I don't have a clear ETA).

If the cardinality does turn out to be the problem, would you suggest trying to drop those labels altogether?

Yeah, or scaling the Pyroscope deployment if performance degrades badly and the labels are really needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants