IO and HTTP¶
yggdrasil.io is the transport surface: buffers, URLs, requests/responses, codecs, media types, sessions, pagination, and caching. Use it instead of reaching directly for urllib3, requests, or io.
Preferred HTTP client: HTTPSession¶
from yggdrasil.io.http_ import HTTPSession
http = HTTPSession()
print(http.get("https://httpbin.org/get").json())
Verbs¶
http.get("https://httpbin.org/get")
http.post("https://httpbin.org/post", json={"name": "alice"})
http.put("https://httpbin.org/put", json={"enabled": True})
http.patch("https://httpbin.org/patch", json={"op": "replace"})
http.delete("https://httpbin.org/delete")
Headers and auth¶
http = HTTPSession(x_api_key="my-api-key")
http.get("https://httpbin.org/headers", headers={"x-trace-id": "run-001"})
Strict status¶
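A minimal sketch of a manual strict-status check, assuming only that a response exposes an integer `status` attribute, as in the other examples on this page. The names `HTTPStatusError`, `ensure_ok`, and `FakeResponse` are hypothetical and not part of yggdrasil; the library's own strict-status option may look different.

```python
from dataclasses import dataclass


class HTTPStatusError(RuntimeError):
    """Hypothetical error raised when a response status is not 2xx."""

    def __init__(self, status: int):
        super().__init__(f"unexpected HTTP status {status}")
        self.status = status


def ensure_ok(resp):
    """Return resp unchanged if its status is 2xx, otherwise raise."""
    if not 200 <= resp.status < 300:
        raise HTTPStatusError(resp.status)
    return resp


@dataclass
class FakeResponse:
    """Stand-in for a real response so the sketch runs offline."""
    status: int


ensure_ok(FakeResponse(status=204))  # 2xx passes through untouched
try:
    ensure_ok(FakeResponse(status=503))
except HTTPStatusError as exc:
    print("failed with", exc.status)
```

In real use, the same check could wrap the result of `http.get(...)` or `http.send(prepared)`.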
Prepared request workflow¶
Useful when you need to inspect, mutate, or sign a request before it goes out:
prepared = http.prepare_request(
    method="POST",
    url="https://httpbin.org/post",
    json={"event": "order_created", "id": 123},
    headers={"x-source": "ygg-docs"},
)
resp = http.send(prepared)
print(resp.status, resp.json()["json"])
Batch / parallel dispatch¶
from yggdrasil.io import SendManyConfig
reqs = [
    http.prepare_request("GET", "https://httpbin.org/get", params={"idx": i})
    for i in range(10)
]
cfg = SendManyConfig(max_workers=5)
responses = list(http.send_many(reqs, send_config=cfg))
print([r.status for r in responses])
Response → analytics formats¶
If the server returns tabular JSON / Arrow, project straight into your engine:
resp = http.get("https://api.example.com/v1/orders?format=arrow")
resp.to_arrow_table()
resp.to_pandas()
resp.to_polars()
resp.to_spark()
For free-form JSON:
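One option is to take `resp.json()` and flatten nested objects into dotted column names before building a table. The sketch below does this in plain Python; `flatten` is a hypothetical helper rather than a yggdrasil function, and the payload is made-up sample data standing in for a real response body.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict with dotted keys; lists are kept as values."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Sample payload standing in for resp.json().
payload = {
    "id": 123,
    "customer": {"name": "alice", "address": {"city": "Oslo"}},
    "tags": ["a", "b"],
}
row = flatten(payload)
print(row)
```

A list of such rows can then be handed to whichever dataframe engine is in use.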
Resilient paged pull (recipe)¶
from yggdrasil.io.http_ import HTTPSession
from yggdrasil.io import SendManyConfig
http = HTTPSession()
# Stage 1: fetch pages concurrently.
reqs = [http.prepare_request("GET", "https://httpbin.org/get", params={"page": p})
        for p in range(1, 6)]
responses = list(http.send_many(reqs, send_config=SendManyConfig(max_workers=3)))
# Stage 2: normalize.
rows = []
for r in responses:
    payload = r.json()
    rows.append({"page": payload.get("args", {}).get("page"),
                 "url": payload.get("url")})
print(rows)
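The recipe above fetches each page once. To tolerate transient failures, each fetch can be wrapped in a retry step. Below is a minimal sketch with exponential backoff; `retry` and `flaky_fetch` are hypothetical helpers, not yggdrasil APIs, and in real use `fetch` might be something like `lambda: http.send(prepared)`.

```python
import time


def retry(fetch, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call fetch() until it succeeds or attempts run out, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Stand-in for a network call that fails twice, then succeeds.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return {"page": 1}


# Pass a no-op sleep so the demo does not actually wait.
result = retry(flaky_fetch, sleep=lambda seconds: None)
print(result, "after", calls["n"], "calls")
```

Catching every `Exception` keeps the sketch short; a production wrapper would normally retry only on errors known to be transient.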
Buffers — BytesIO¶
yggdrasil.io.BytesIO is a spill-to-disk byte buffer with media/compression detection:
from yggdrasil.io import BytesIO
with BytesIO() as buf:
    buf.write(b"hello")
    buf.seek(0)
    print(buf.compression)
    print(buf.media_type)
Buffer changes anywhere in the codebase must preserve spill-to-disk behavior, codec handling, cursor safety, and Arrow/Parquet/JSON/IPC compatibility.
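To illustrate what compression detection on a byte buffer can look like, the sketch below checks well-known magic bytes at the start of the data. It is not yggdrasil's implementation: `sniff_compression` is a hypothetical helper and the returned labels are illustrative.

```python
import bz2
import gzip
import lzma

# Leading magic bytes of common compression formats.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd\x37\x7a\x58\x5a\x00": "xz",
    b"\x28\xb5\x2f\xfd": "zstd",
}


def sniff_compression(data: bytes):
    """Return a format label if data starts with a known magic prefix, else None."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return None


print(sniff_compression(gzip.compress(b"hello")))
print(sniff_compression(bz2.compress(b"hello")))
print(sniff_compression(lzma.compress(b"hello")))
print(sniff_compression(b"hello"))
```

Sniffing only needs the first few bytes, which is why a buffer can report its compression without reading a spilled payload back from disk.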
URL parsing and composition¶
from yggdrasil.io import URL
u = URL.from_str("https://example.com/a/b?q=1")
print(u.host, u.path)
print(u.with_query_items({"q": 2, "lang": "en"}).to_string())
URL is immutable. Mutate via with_* methods that return a new instance.
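The immutable `with_*` pattern can be sketched with a frozen dataclass. `MiniURL` is a hypothetical toy class for illustration only: it is not how yggdrasil's `URL` is implemented, it covers only host, path, and query, and it assumes an `https` scheme.

```python
from dataclasses import dataclass, replace
from urllib.parse import urlencode


@dataclass(frozen=True)
class MiniURL:
    """Toy immutable URL: fields cannot be reassigned after construction."""
    host: str
    path: str = "/"
    query: tuple = ()  # tuple of (key, value) pairs keeps the instance hashable

    def with_query_items(self, items: dict) -> "MiniURL":
        """Return a new MiniURL with items merged over the existing query."""
        merged = {**dict(self.query), **{k: str(v) for k, v in items.items()}}
        return replace(self, query=tuple(merged.items()))

    def to_string(self) -> str:
        """Render the URL; the scheme is fixed to https in this toy."""
        qs = f"?{urlencode(self.query)}" if self.query else ""
        return f"https://{self.host}{self.path}{qs}"


u = MiniURL("example.com", "/a/b", (("q", "1"),))
v = u.with_query_items({"q": 2, "lang": "en"})
print(u.to_string())  # the original instance is unchanged
print(v.to_string())
```

Because every change yields a new object, a URL can be shared across threads or used as a cache key without defensive copying.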
Legacy retry-only session¶
yggdrasil.requests.YGGSession exists for back-compat. Use HTTPSession for new code.
from yggdrasil.requests import YGGSession
legacy = YGGSession(num_retry=3)
print(legacy.get("https://example.com", timeout=10).status_code)
There's also yggdrasil.requests.MSALSession for Azure scenarios.
Observability fields¶
Downstream tooling relies on the request/response models preserving:
- normalized URL parts,
- promoted/remaining headers,
- body bytes,
- payload hashes,
- timestamps,
- status / timing.
If you write a wrapper, keep these fields populated.
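One way for a wrapper to keep these fields populated is to capture them in a small record at send time. This is a minimal sketch: `ExchangeRecord`, `record_exchange`, and their field names are hypothetical and do not mirror yggdrasil's actual models, SHA-256 is an assumed choice of payload hash, and the promoted/remaining header split is not modeled (headers are only lowercased).

```python
import hashlib
import time
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ExchangeRecord:
    """Hypothetical record of one request/response exchange."""
    scheme: str
    host: str
    path: str
    query: str
    headers: dict
    body: bytes
    body_sha256: str
    sent_at: float    # Unix timestamp when the request was sent
    status: int
    elapsed_s: float  # seconds between send and receive


def record_exchange(url, headers, body, status, sent_at, received_at):
    """Build a record with normalized URL parts, a body hash, and timing."""
    parts = urlsplit(url)
    return ExchangeRecord(
        scheme=parts.scheme,
        host=(parts.hostname or "").lower(),
        path=parts.path or "/",
        query=parts.query,
        headers={k.lower(): v for k, v in headers.items()},
        body=body,
        body_sha256=hashlib.sha256(body).hexdigest(),
        sent_at=sent_at,
        status=status,
        elapsed_s=received_at - sent_at,
    )


start = time.time()
rec = record_exchange(
    "https://Example.com/v1/orders?page=2",
    {"X-Trace-Id": "run-001"},
    b'{"ok": true}',
    200,
    start,
    start + 0.25,
)
print(rec.host, rec.path, rec.status, rec.body_sha256[:12])
```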