external/uBlock

mirror of https://github.com/gorhill/uBlock.git synced 2024-11-11 17:41:03 +01:00

Author	SHA1	Message	Date
Raymond Hill	0ca44b847c	Avoid duplicated strings in filterOrigin w/ new approach The new approach is simpler and should benefit selfie serialization/unserialization. This renders stringDeduplicater obsolete -- it has been removed.	2019-05-17 10:13:58 -04:00
Raymond Hill	c4f9ae706a	Fix alternate code path introduced in `295f08da97` (oops)	2019-04-28 14:18:09 -04:00
Raymond Hill	295f08da97	Implement code path for when TextDecoder() is not available The primary purpose is to unbreak https://github.com/cliqz-oss/adblocker/tree/master/bench/comparison	2019-04-28 14:07:21 -04:00
Raymond Hill	ac58b8e688	Make token hashes fit within a 32-bit integer The staticNetFilteringEngine uses token hashes to store/lookup filters into Map objects. Before this commit, the tokens were encoded into token hashes as JS numbers (not exceeding MAX_SAFE_INTEGER) using at most the 8 first characters of the token. With this commit, token hashes are now restricted to fit into 32-bit integers, and are derived from at most the 7 first characters. This improves filter look-up performance as per built-in benchmark().	2019-04-28 10:15:15 -04:00
Raymond Hill	96dce22218	Increase resolution of known-token lookup table Related commit: - `69a43e07c4` Using 32 bits of token hash rather than just the 16 lower bits does help discard more unknown tokens. Using the default filter lists, the known-token lookup table is populated by 12,276 entries, out of 65,536, thus making the case that theoretically there are a lot of possible tokens which can be discarded. In practice, running the built-in staticNetFilteringEngine.benchmark() with default filter lists, I find that 1,518,929 tokens were skipped out of 4,441,891 extracted tokens, or 34%.	2019-04-27 08:18:01 -04:00
Raymond Hill	69a43e07c4	Ignore unknown tokens in urlTokenizer.getTokens() Given that all tokens extracted from one single URL are potentially iterated multiple times in a single URL-matching cycle, it pays to ignore extracted tokens which are known to not be used anywhere in the static filtering engine. The gain in processing a single network request in the static filtering engine can become especially high when dealing with long and random-looking URLs, which URLs have a high likelihood of containing a majority of tokens which are known to not be in use.	2019-04-26 17:14:00 -04:00
Raymond Hill	a52b07ff6e	Make `userResourcesLocation` able to support multiple URLs The URLs must be space-separated. Reminders: - The additional resources will be updated at the same time the built-in resource file is updated - Purging the cache of 'uBlock filters' will also purge the cache of the built-in resource file -- and hence force a reload of the user's custom resources if any Related issues: - https://github.com/gorhill/uBlock/issues/3307 - https://github.com/uBlockOrigin/uAssets/issues/5184#issuecomment-475875189 Additionally: - Opportunistically promisified assets.fetchText() - Fixed https://github.com/gorhill/uBlock/issues/3586	2019-04-20 17:16:49 -04:00
Raymond Hill	fa83744b58	Use a sequence of base 64 numbers to encode array buffers The purpose of using a custom base128 encoder is to convert array buffers into strings, to allow a direct string-to-array buffer conversion at load time: string => array buffer Whereas a JSON array would require an extra step: JSON array as string => JS array => array buffer Turns out that the current use of a custom base128 encoding results in a significantly larger selfie storage usage when converting array buffers into strings. Speculation: possibly the browser converts the strings to save into JSON strings internally. Since the custom base128 encoder is likely to cause the resulting string to contain a lot of unprintable ASCII characters, these will need to be escaped when converted to JSON -- escaped characters occupy more space than non-escaped ones. Using a sequence of base 64 numbers means only printable characters will be present in the output string, hence no escaping necessary. I have observed significant reduction in storage usage for selfie purpose.	2019-04-20 09:06:54 -04:00
Raymond Hill	3f3a1543ea	Add HNTrie-based filter classes to store origin-only filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/528#issuecomment-484408622 Following STrie-related work in above issue, I noticed that a large number of filters in EasyList were filters which only had to match against the document origin. For instance, among just the top 10 most populous buckets, there were four such buckets with over a hundred entries each: - bits: 72, token: "http", 146 entries - bits: 72, token: "https", 139 entries - bits: 88, token: "http", 122 entries - bits: 88, token: "https", 118 entries The filters in these buckets have to be matched against all the network requests. In order to leverage HNTrie for these filters[1], they are now handled in a special way so as to ensure they all end up in a single HNTrie (per bucket), which means that instead of scanning hundreds of entries per URL, there is now a single scan per bucket per URL for these apply-everywhere filters. Now, any filter which fulfills ALL the following conditions will be processed in a special manner internally: - Is of the form `|https://` or `|http://` or `*`; and - Does have a `domain=` option; and - Does not have a negated domain in its `domain=` option; and - Does not have a `csp=` option; and - Does not have a `redirect=` option If a filter does not fulfill ALL the conditions above, no change in behavior. A filter which matches ALL of the above will be processed in a special manner: - The `domain=` option will be decomposed so as to create as many distinct filters as there are distinct values in the `domain=` option - This also applies to the `badfilter` version of the filter, which means it now becomes possible to `badfilter` only one of the distinct filters without having to `badfilter` all of them. - The logger will always report these special filters with only a single hostname in the `domain=` option. [1] HNTrie is currently WASM-ed on Firefox.	2019-04-19 16:33:46 -04:00
Raymond Hill	a594b3f3d1	Add µBlock.staticNetFilteringEngine.bucketHistogram() as investigative dev tool Additionally, lower the threshold of trieability to 4 for FilterPlainPrefix1.	2019-04-15 11:45:33 -04:00
Raymond Hill	008370e4b9	Fix https://github.com/uBlockOrigin/uBlock-issues/issues/461 uBO will fallback using a JSON string when trying to encode an array buffer in Chromium version 59 and earlier.	2019-03-16 09:00:31 -04:00
Raymond Hill	928ab91ab8	Add support to benchmark the dynamic filtering pane From uBO's dev console, type: - `µBlock.sessionFirewall.benchmark();` Keep in mind that it's the temporary ruleset being benchmarked.	2019-02-19 10:46:33 -05:00
Raymond Hill	ed7e34fb07	Refactor selfie generation into a more flexible persistence mechanism The motivation is to address the higher peak memory usage at launch time with 3rd-gen HNTrie when a selfie was present. The selfie generation prior to this change was to collect all filtering data into a single data structure, and then to serialize that whole structure at once into storage (using JSON.stringify). However, HNTrie serialization requires that a large Uint32Array be converted into a plain JS array, which itself would be indirectly converted into a JSON string. This was the main reason why peak memory usage would be higher at launch from selfie, since the JSON string would need to be wholly unserialized into JS objects, which themselves would need to be converted into more specialized data structures (like that Uint32Array one). The solution to lower peak memory usage at launch is to refactor selfie generation to allow a more piecemeal approach: each filtering component is given the ability to serialize itself rather than to be forced to be embedded in the master selfie. With this approach, the HNTrie buffer can now serialize to its own storage by converting the buffer data directly into a string which can be directly sent to storage. This avoids expensive intermediate steps such as converting into a JS array and then to a JSON string. As part of the refactoring, there was also opportunistic code upgrade to ES6 and Promise (eventually all of uBO's code will be proper ES6). Additionally, the polyfill to bring getBytesInUse() to Firefox has been revisited to replace the rather expensive previous implementation with an implementation with virtually no overhead.	2019-02-14 13:33:55 -05:00
Raymond Hill	261ef8c510	Add support for procedural :not to HTML filtering Related issue: <https://github.com/gorhill/uBlock/issues/3683> Additionally, improve compile-time error reporting in the logger	2018-12-15 10:46:17 -05:00
Raymond Hill	5b7a3c9983	fix https://github.com/uBlockOrigin/uBlock-issues/issues/256 ; add regex support in logger filter field	2018-12-14 11:01:21 -05:00
Raymond Hill	cabb0d36b6	fix https://github.com/gorhill/uBlock/issues/3371	2018-10-23 14:01:08 -03:00
Raymond Hill	777144b036	fix https://github.com/uBlockOrigin/uBlock-issues/issues/200	2018-09-03 16:15:51 -04:00
Raymond Hill	8f1b4b52fd	fix #3606	2018-08-09 11:31:25 -04:00
Raymond Hill	7766786b2c	code review: reuse last decomposed hostname (hit rate = 75%)	2018-06-03 13:27:42 -04:00
Raymond Hill	2c843f6e69	code review: chromium 45 supports arrow functions = start using them	2018-06-01 11:49:48 -04:00
Raymond Hill	798f8dab9d	reduce baseline memory at selfie-load time	2018-06-01 07:54:31 -04:00
Raymond Hill	a9f68fe02f	Fix #3069 , and consequently #3374 , #3378 . A new filtering class has been created: "static extended filtering". This new class is an umbrella class for more specialized filtering engines: - Cosmetic filtering - Scriptlet filtering - HTML filtering HTML filtering is available only on platforms which support modifying the response body on the fly, so only Firefox 57+ at the moment. With the ability to modify the response body, HTML filtering has been introduced: removing elements from the DOM before the source data has been parsed by the browser. A consequence of HTML filtering ability is to bring back script tag filtering feature.	2017-12-28 13:49:02 -05:00
Raymond Hill	4ab63e70fe	code review: avoid Array.splice/unshift The array size stays the same, items are just moved around.	2017-12-22 09:37:26 -05:00
Raymond Hill	607968de7f	code review: cache most-recently-used pre-filled scriptlets	2017-12-21 17:05:25 -05:00
gorhill	386e8bee9c	fix #3210	2017-11-09 12:53:05 -05:00
gorhill	6112a68faf	fix #2984	2017-10-21 13:43:46 -04:00
gorhill	9a4681d4e1	fix #2656	2017-05-27 14:31:46 -04:00
gorhill	aae97b8535	fix badfilter option; performance work - badfilter option was no longer working following last refactoring changes. - performance work: - reduce duplication of large strings. - new lighter FilterBucket to use when only 2 filters: FilterPair.	2017-05-26 20:00:21 -04:00
gorhill	8d2319e011	fix "purge all" button not disabled when there is nothing left to purge	2017-05-26 08:31:19 -04:00
gorhill	f3e6057e07	fix #2598 : refactor to address the cause rather than the symptoms	2017-05-25 17:46:59 -04:00
gorhill	fd03683045	minor code review: it makes no difference, I just prefer no indent there	2017-05-20 16:32:42 -04:00
gorhill	acf7562b0f	minor code review	2017-05-19 20:22:26 -04:00
gorhill	fcf43d972e	tentatively fix issue reported in #2612 re. FFox 24.8.1	2017-05-19 10:12:55 -04:00
gorhill	a222e23e49	fix #2630	2017-05-19 08:45:19 -04:00
gorhill	0232382695	refactor static network filtering, add support for csp injection	2017-05-12 10:35:11 -04:00
gorhill	a4e20ae3ad	new filter option: "badfilter" (see https://github.com/uBlockOrigin/uAssets/issues/192 )	2017-03-11 13:55:47 -05:00
gorhill	0b4f31bd8a	fix #2344	2017-01-27 13:44:52 -05:00
gorhill	da163bbe4b	fix #1641	2016-10-13 13:25:57 -04:00
gorhill	b105010f34	minor code review	2016-10-11 11:53:28 -04:00
gorhill	ef0a7ed5cb	code review re. #1997 : be sure the setting is persisted	2016-09-16 19:12:16 -04:00
gorhill	269c35a04a	fix #1997	2016-09-16 17:41:17 -04:00
gorhill	a7fe367eec	refactor where appropriate to make use of ES6 Set/Map (#1070 ) At the same time, the following issues were fixed: - #1954: automatically lookup site-specific scriptlets - https://github.com/uBlockOrigin/uAssets/issues/23	2016-09-12 10:22:25 -04:00
gorhill	e9157bafb7	fix #1892 , #1891	2016-08-13 16:42:58 -04:00
gorhill	a944873b83	code review: convert static filtering's tokenizer to a global utility	2015-12-29 11:34:41 -05:00
Deathamns	95b778fbc7	Change extension description	2015-03-07 19:20:18 +01:00
gorhill	f60f149531	1000 is k, not K	2014-12-24 08:11:22 -05:00
Deathamns	30ef97a678	Fix messaging for Safari	2014-11-09 17:41:07 +01:00
Deathamns	0886f7e886	Add .jshintrc, and use the "use strict" directive .jshintrc's option-set is a personal choice, merely a suggestion. Besides that, it includes some common globals for specific browsers, so there's no need to set the globals in every .js file. In order to force strict coding, "use strict" directive was added into every .js file.	2014-11-09 17:39:17 +01:00
Deathamns	5b79bf3536	Work on vendor API abstraction, and near complete Safari support	2014-11-09 17:39:12 +01:00
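
The known-token idea described in commits `ac58b8e688`, `96dce22218` and `69a43e07c4` above can be illustrated with a minimal sketch. This is not uBlock Origin's actual code: the names `hashToken`, `KnownTokens` and `getTokens` are hypothetical, the FNV-1a hash and the way 32 bits are folded into a table slot are assumptions made for brevity, and the token regex is a simplification. Only the 7-character limit, the 32-bit hash width and the 65,536-slot table size come from the commit messages.

```javascript
// Illustrative sketch only; not uBlock Origin's actual implementation.
// hashToken, KnownTokens and getTokens are hypothetical names.

// Hash at most the first 7 characters of a token into an unsigned 32-bit
// integer. FNV-1a is an assumption chosen for brevity.
function hashToken(token) {
    let hash = 0x811c9dc5;
    const n = Math.min(token.length, 7);
    for (let i = 0; i < n; i++) {
        hash ^= token.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return hash >>> 0;
}

// Lookup table with 65,536 slots. A slot set to 1 means that at least one
// token used by a filter hashes to it.
class KnownTokens {
    constructor() {
        this.table = new Uint8Array(65536);
    }
    // Fold all 32 bits of the hash into a 16-bit slot index (assumed scheme).
    static slot(hash) {
        return (hash ^ (hash >>> 16)) & 0xffff;
    }
    add(token) {
        this.table[KnownTokens.slot(hashToken(token))] = 1;
    }
    mayContain(token) {
        return this.table[KnownTokens.slot(hashToken(token))] === 1;
    }
}

// Extract tokens from a URL and drop those known to be unused by any filter.
function getTokens(url, known) {
    const candidates = url.toLowerCase().match(/[0-9a-z%]+/g) || [];
    return candidates.filter(token => known.mayContain(token));
}

// Usage: register the tokens used by filters, then tokenize a URL.
const known = new KnownTokens();
for (const token of ['ads', 'doubleclick', 'tracker']) {
    known.add(token);
}
const kept = getTokens('https://example.com/ads/x9f3k2q7?id=tracker', known);
console.log(kept);
```

A token that no filter uses can still survive when it happens to share a slot with a registered token. That is harmless in this design, because the table only serves as a cheap pre-filter ahead of the real filter lookup; it never discards a token that a filter does use.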