TODOs
=====
WIP
---
merged in mm
- Handle non-power-of-two min_nr_regions aligning.
- ALIGN() is only for power of two alignment.
- merged in mm.git
- per-filtered-address-range prioritization
- do the temperature-prioritization for address filter-passed regions,
separately
- merged in mm.git
- respect min_nr_regions from beginning
- merged in mm.git
- fix sampling intervals overflow
- add CONFIG_DAMON_HARDENED for integrity checking
- allow stop-after-achieving DAMOS goal
- add per-goal quota tuning strategy
- consist: the current one, default.
- temporal: the new one.
- bpf: for future?
- fix walk_system_ram() type violation
- https://lore.kernel.org/20260129161029.48991-1-sj@kernel.org
- mark DAMOS filters/ dir deprecated on doc
- cover all physical address space from DAMON modules
- add kdamond_pid to damon_stat.
- let DAMON_RECLAIM auto-tune intervals.
- kdamond pause/resume.
- Use it for drgn tests.
- Let regions where a DAMOS action failed be charged at a different rate
- e.g., charge only 1 byte for 1 MiB of failed region.
- Documentation cleanup: Make link from design to API
- deprecate sysfs filters/ in favor of {core,ops}_filters/
- warn on use of filters/, by 2027
- rename filters/ to filters_DEPRECATED/, by 2028
- remove filters/ code and documentation, by 2029
Planning / Considering
----------------------
- Be aware of 1 GiB hugetlb pages when auto-tuning intervals.
- Subtract 1 GiB hugetlb pages from the total size of memory.
- Let DAMON_STAT and DAMON_RECLAIM run in parallel.
- Let DAMON API callers share a kdamond
- What kdamond to share?
- A single one that is always running, say, kdamond*<N>?
- Just the kdamond of the first caller that started (could be any
kdamond<N>)?
- counters of interest
- for production level lightweight page-level monitoring
- let users register counters of interest
- when doing access check, check if the sampled page is of interest and,
if so, increase the counter.
- users learn what proportion of each region is of the type.
- nr_accesses will be the default counter
- add drgn selftests for DAMON modules.
- Support tlb flushing
- https://lore.kernel.org/all/a2fb10bd-b44a-350e-f693-82ecfa6f54a8@huawei.com/
- automatic paddr regions detection
(handle hot(un)plugged memory regions)
- similar to vaddr, detect and cover all online memory as much as possible,
with a few holes.
- make DAMON_STAT cover multiple hot-[un]pluggable NUMA nodes
- implement and use kind of three big regions detection mechanism
- distinction of uncheckable memory
- not just say it is not accessed, but disclose that the access could not
be checked, and apply merge-split with that fact.
- add kunit test for all parameters commit functions
- add damos_stat for auto-tuned zero size applying
- DAMOS_KILL: DAMON-based OOM killing
- If DAMON_RECLAIM doesn't increase free memory, kill processes of hot
pages.
- implement pause/resume of kdamonds
- Or, let user feed nr_accesses and age of regions
- User can snapshot the last monitoring results and continue monitoring
from the status.
- support (multiple) kdamosd
- per-node memory bandwidth utilization DAMOS quota goal metric
- clarify possible monitoring results loss on usage doc.
- implement addr_unit commit kunit test
- implement addr_unit to/from core address conversion kunit test
- handle charged_ns overflow on 32bit machines
- cleanup damon_set_attrs() documentation
- add kunit test for damon_call() and damos_walk()
- DAMON-based khugepaged
- Make khugepaged listen to DAMON's voice when collapsing
- Support multi contexts per kdamond
- Add sharable kdamond
- Run with auto-tuned intervals
- Let API callers report access information, read monitoring results and
add/remove DAMOS schemes
- Major callers would be DAMON modules
- damon_get_handle(), damon_report_access(), damos_add_folios()
- Let ABI users read monitoring results and DAMOS stat
- /sys/kernel/mm/damon/???/
- DAMON_NUMA_MIGRATE
- Aim for not only traditional NUMA but also tiering
- Just extend mtier to every node, but CPU-aware.
- sysfs command for updating params (for restoring old params)
- write a selftest for sz_filter_passed
- contiguized-vaddr
- treat end of a discrete region as start of next region
- connect regions before and after huge unmapped area
- connect regions of different virtual addresses
- Write API documentation
- not just kernel-doc, but a more structured document is needed
- Let users decide regions split factor
- https://lore.kernel.org/20241026215311.148363-1-sj@kernel.org
- Let users periodically split regions without per-region subregions limit
- https://lore.kernel.org/20241026215311.148363-1-sj@kernel.org
- Sampling based page level properties based monitoring
- For DAMOS_STAT, do sampling for sz_filter_passed calculation
- Let users set the number of samples per region for this
- Support reserved uninstall of DAMOS
- Allow running DAMOS scheme for only specific apply intervals
- Extend for memory bandwidth monitoring
- Extend for AMD IBS-based monitoring
- Extend for cache-set space monitoring
- Extend for cache-line space monitoring
- Requires sub-page level monitoring (IBS?)
- Access/Contiguity-aware Memory Auto-scaling
- https://lore.kernel.org/damon/20240512193657.79298-1-sj@kernel.org
- Support cleaning sysfs input files up to committed values
- holistic heterogeneous memory management
- address CPU-numa, CXL-numa, and device(e.g., GPU)-numa nodes
- DAMON_LRU_SORT auto-tuning
- Let auto-tuning use active/inactive memory ratio
- Selftests: Test DAMON online tuning
- Selftests: Test DAMOS online tuning
- Selftests: Test DAMOS filter
- Make 'age' counted by sample_interval rather than aggregation interval
- CPU time quota for DAMON monitoring
- monitoring part CPU usage statistics
- Setting resolution of damos tried_regions
- Should be able to control the directories population overhead when the
number of regions is big
- More DAMON modules
- DAMON-based THP hinting module
- rename nr_accesses/moving_accesses_bp
- mark nr_accesses as private
- use a dedicated struct for access rate
- let kdamond name be user-defined
- let DAMON modules share one kdamond
- unify DAMON modules
- support multiple contexts per kdamond
- DAMON-based VMA split/merge
- Help big VMA contention issue?
- We can further expose the monitoring results via vma name
- Reading results becomes very easy
- Contig memory access util monitoring
- WSS/RSS based processes sorting
- LRU-based monitoring ops
- Fixed granularity idleness monitoring
- Must be useful for further DAMON overhead/accuracy evaluation
- Improve regions-based monitoring quality
- Support cgroups
- Add __counted_by() annotation when ready
(https://lore.kernel.org/r/CAKwvOdkvGTGiWzqEFq=kzqvxSYP5vUj3g9Z-=MZSQROzzSa_dg@mail.gmail.com)
- Ideas from LSFMM
- Add operations driven access check
- Let it call DAMON functions for noticed access and then let DAMON record
the access
- Take care of fairness on ACMA (e.g., NUMA)
- Consider hugetlb handling optimization
- DAMON_RECLAIM will meaninglessly try reclaiming hugetlb pages and consume
CPU time. Find a way to optimize.
- DAMON in process context
- Do the monitoring for each process in task_work context, like NUMA
balancing installs prot_none.
Frozen
------
Recently Done
-------------
- DAMOS stat/control improvement
- Show how many snapshots the scheme has processed
- Let DAMOS be deactivated based on the number of processed snapshots
- Provide a tracepoint for DAMOS stat
- merged into 7.0-rc1
- DAMON_LRU_SORT modernization
- merged into 7.0-rc1
- hide kdamond and kdamond_lock from API callers
- merged into 7.0-rc1
- modernize wss_estimation kselftest
- merged into 7.0-rc1
- document modules usage on usage doc
- merged into 7.0-rc1
Non-DAMON issues
----------------